diff --git a/docs/source-pytorch/fabric/fabric.rst b/docs/source-pytorch/fabric/fabric.rst
index 8744fae2ba..1bf32a5ed9 100644
--- a/docs/source-pytorch/fabric/fabric.rst
+++ b/docs/source-pytorch/fabric/fabric.rst
@@ -2,19 +2,15 @@
 Fabric (Beta)
 #############
 
-Fabric allows you to scale any PyTorch model with just a few lines of code!
-With Fabric, you can easily scale your model to run on distributed devices using the strategy of your choice while keeping complete control over the training loop and optimization logic.
+Fabric is the fast and lightweight way to scale PyTorch models without boilerplate code.
 
-With only a few changes to your code, Fabric allows you to:
-
-- Automatic placement of models and data onto the device
-- Automatic support for mixed precision (speedup and smaller memory footprint)
-- Seamless switching between hardware (CPU, GPU, TPU)
-- State-of-the-art distributed training strategies (DDP, FSDP, DeepSpeed)
-- Easy-to-use launch command for spawning processes (DDP, torchelastic, etc)
-- Multi-node support (TorchElastic, SLURM, and more)
-- You keep complete control of your training loop
+- Easily switch from running on CPU to GPU (Apple Silicon, CUDA, ...), TPU, multi-GPU, or even multi-node training
+- State-of-the-art distributed training strategies (DDP, FSDP, DeepSpeed) and mixed precision out of the box
+- Handles all the boilerplate device logic for you
+- Brings useful tools to help you build a trainer (callbacks, logging, checkpoints, ...)
+- Designed with multi-billion parameter models in mind
 
+|
 
 .. code-block:: diff
 
@@ -60,6 +56,32 @@ With only a few changes to your code, Fabric allows you to:
 
 ----
 
+***********
+Why Fabric?
+***********
+
+Fabric differentiates itself from a fully-fledged trainer like the :doc:`Lightning Trainer <../common/trainer>` in these key aspects:
+
+**Fast to implement**
+There is no need to restructure your code: just change a few lines in your PyTorch script and you can leverage Fabric's features.
+
+**Maximum Flexibility**
+Write your own training and/or inference logic, down to the individual optimizer calls.
+You aren't forced to conform to a standardized epoch-based training loop like the one in the :doc:`Lightning Trainer <../common/trainer>`.
+You can do flexible iteration-based training, meta-learning, cross-validation, and other types of optimization algorithms without digging into framework internals.
+This also makes it easy to adopt Fabric in existing PyTorch projects to speed up and scale your models without a large refactor.
+Just remember: with great power comes great responsibility.
+
+**Maximum Control**
+The :doc:`Lightning Trainer <../common/trainer>` has many built-in features to make research simpler with less boilerplate, but debugging it requires some familiarity with the framework internals.
+In Fabric, everything is opt-in. Think of it as a toolbox: you take out the tools (Fabric functions) you need and leave the rest behind.
+This makes it easier to develop and debug your PyTorch code as you gradually add more features to it.
+Fabric provides important tools to remove undesired boilerplate code (distributed, hardware, checkpoints, logging, ...), but leaves the design and orchestration fully up to you.
+
+
+----
+
+
 ************
 Fundamentals
 ************
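
To make the "Fast to implement" claim above concrete, the following is a minimal sketch of what such a conversion looks like. It assumes the import path ``from lightning.fabric import Fabric`` and uses a throwaway linear model with random tensors as stand-ins for a real model and dataset:

.. code-block:: python

    import torch
    from torch.utils.data import DataLoader, TensorDataset
    from lightning.fabric import Fabric  # assumed import path for the Fabric package

    fabric = Fabric(accelerator="auto", devices=1)  # switch hardware/strategy here, not inside the loop
    fabric.launch()

    model = torch.nn.Linear(32, 2)  # stand-in model
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    model, optimizer = fabric.setup(model, optimizer)  # places the model on the device, applies the strategy

    dataset = TensorDataset(torch.randn(64, 32), torch.randint(0, 2, (64,)))  # stand-in data
    dataloader = fabric.setup_dataloaders(DataLoader(dataset, batch_size=8))  # adds distributed sampling when needed

    for inputs, targets in dataloader:  # batches already arrive on the right device
        optimizer.zero_grad()
        loss = torch.nn.functional.cross_entropy(model(inputs), targets)
        fabric.backward(loss)  # replaces loss.backward(); handles precision and strategy details
        optimizer.step()

Everything outside the ``fabric.*`` calls stays plain PyTorch, which is why an existing project can adopt it incrementally; swapping the ``accelerator``, ``devices``, ``strategy``, or ``precision`` arguments changes where and how the same loop runs.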
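
The "Maximum Control" paragraph describes Fabric as an opt-in toolbox. Below is a short sketch of that idea, assuming the callback, hook, and checkpoint helpers (``Fabric(callbacks=...)``, ``fabric.call``, ``fabric.save``) found in recent Fabric releases; the ``PrintingCallback`` class and the ``"checkpoint.ckpt"`` path are made up for illustration:

.. code-block:: python

    import torch
    from lightning.fabric import Fabric  # assumed import path


    class PrintingCallback:
        # Fabric callbacks are plain objects; only the hooks you define are invoked
        def on_train_end(self):
            print("Training finished.")


    fabric = Fabric(accelerator="auto", devices=1, callbacks=[PrintingCallback()])
    fabric.launch()

    model = torch.nn.Linear(32, 2)  # stand-in model
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    model, optimizer = fabric.setup(model, optimizer)

    # ... your own training loop, e.g. the one from the previous sketch ...

    fabric.call("on_train_end")  # you decide when (and whether) hooks fire
    state = {"model": model, "optimizer": optimizer}  # Fabric extracts the state dicts when saving
    fabric.save("checkpoint.ckpt", state)  # assumed signature: save(path, state)

None of these helpers are required; a script that only uses ``fabric.setup`` and ``fabric.backward`` is equally valid, which is the point of the toolbox design.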