From d79eaae33423d081ffbaae08b405d29a32d1d528 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Adrian=20W=C3=A4lchli?=
Date: Mon, 17 Jul 2023 18:02:54 +0200
Subject: [PATCH] Update deepspeed model-parallel docs (#18091)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
---
 docs/source-pytorch/advanced/model_parallel.rst | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/docs/source-pytorch/advanced/model_parallel.rst b/docs/source-pytorch/advanced/model_parallel.rst
index 2408ca9376..04f8fc2d2a 100644
--- a/docs/source-pytorch/advanced/model_parallel.rst
+++ b/docs/source-pytorch/advanced/model_parallel.rst
@@ -187,12 +187,10 @@
 Here's an example using that uses ``wrap`` to create your model:
 
     class MyModel(pl.LightningModule):
-        def __init__(self):
-            super().__init__()
+        def configure_model(self):
             self.linear_layer = nn.Linear(32, 32)
             self.block = nn.Sequential(nn.Linear(32, 32), nn.Linear(32, 32))
 
-        def configure_model(self):
             # modules are sharded across processes
             # as soon as they are wrapped with `wrap`.
             # During the forward/backward passes, weights get synced across processes
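
Note: the patch above moves the layer definitions from __init__ into configure_model so that modules are created (and wrapped) only when the sharded strategy sets them up. As a minimal sketch of what the completed example might look like, assuming Lightning's FSDP strategy and torch's ``wrap`` helper: everything beyond the two layer definitions (the imports, the ``wrap`` calls, forward, training_step, configure_optimizers, and the Trainer usage) is an illustrative assumption and not part of this commit.

    # Sketch only: assumes `lightning.pytorch` and torch's FSDP `wrap` helper.
    import torch
    import torch.nn as nn
    import lightning.pytorch as pl
    from torch.distributed.fsdp.wrap import wrap


    class MyModel(pl.LightningModule):
        def configure_model(self):
            # Layers are created here instead of __init__, so they are only
            # materialized when the strategy sets up sharding. `wrap` shards the
            # module across processes; outside an FSDP wrapping context it is a no-op.
            self.linear_layer = wrap(nn.Linear(32, 32))
            self.block = wrap(nn.Sequential(nn.Linear(32, 32), nn.Linear(32, 32)))

        def forward(self, x):
            return self.linear_layer(self.block(x))

        def training_step(self, batch, batch_idx):
            # Placeholder objective for illustration only.
            return self(batch).sum()

        def configure_optimizers(self):
            return torch.optim.AdamW(self.parameters(), lr=1e-3)


    # Hypothetical usage: run under a sharded strategy so configure_model is
    # called inside the FSDP wrapping context.
    # trainer = pl.Trainer(accelerator="gpu", devices=2, strategy="fsdp")
    # trainer.fit(MyModel(), train_dataloaders=...)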