Clarify setup of optimizer when using `empty_init=True` (#19067)

Adrian Wälchli 2023-11-26 11:04:36 +01:00 committed by GitHub
parent af852ff590
commit b79b68481e
1 changed file with 4 additions and 0 deletions


@@ -75,6 +75,10 @@ When training sharded models with :doc:`FSDP <model_parallel/fsdp>` or DeepSpeed
     model = fabric.setup(model)  # parameters get sharded and initialized at once
 
+    # Make sure to create the optimizer only after the model has been set up
+    optimizer = torch.optim.Adam(model.parameters())
+    optimizer = fabric.setup_optimizers(optimizer)
+
 .. note::
     Empty-init is experimental and the behavior may change in the future.
     For FSDP on PyTorch 2.1+, it is required that all user-defined modules that manage parameters implement a ``reset_parameters()`` method (all PyTorch built-in modules have this too).
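
For context, here is a minimal end-to-end sketch of the pattern the changed docs describe, assuming a machine with 2 CUDA GPUs and the FSDP strategy; ``MyModel`` is a placeholder for any user-defined module:

    import torch
    import lightning as L


    class MyModel(torch.nn.Module):
        """Placeholder user model; any nn.Module works here."""

        def __init__(self) -> None:
            super().__init__()
            self.layer = torch.nn.Linear(512, 512)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.layer(x)


    fabric = L.Fabric(accelerator="cuda", devices=2, strategy="fsdp")
    fabric.launch()

    # Instantiate the model with empty (uninitialized) weights so the full
    # model is never materialized on a single device.
    with fabric.init_module(empty_init=True):
        model = MyModel()

    model = fabric.setup(model)  # parameters get sharded and initialized at once

    # Create the optimizer only after setup(), so it references the sharded,
    # materialized parameters instead of the empty ones created above.
    optimizer = torch.optim.Adam(model.parameters())
    optimizer = fabric.setup_optimizers(optimizer)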
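
The ``reset_parameters()`` requirement from the note applies to modules that create their own parameters directly. A hedged sketch of what satisfying it looks like (``ScaledLinear`` is invented for illustration; FSDP calls ``reset_parameters()`` to initialize weights that were created empty):

    import math
    import torch


    class ScaledLinear(torch.nn.Module):
        """Hypothetical module that manages its own parameters."""

        def __init__(self, in_features: int, out_features: int) -> None:
            super().__init__()
            self.weight = torch.nn.Parameter(torch.empty(out_features, in_features))
            self.reset_parameters()

        def reset_parameters(self) -> None:
            # FSDP on PyTorch 2.1+ invokes this to (re)initialize the
            # materialized parameters after empty initialization.
            torch.nn.init.kaiming_uniform_(self.weight, a=math.sqrt(5))

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return torch.nn.functional.linear(x, self.weight)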