Make DDP and Horovod batch_size scaling examples explicit (#9813)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
parent 3392215ef6
commit 9317fbfc25
@@ -611,16 +611,17 @@ Let's say you have a batch size of 7 in your dataloader.

     def train_dataloader(self):
         return Dataset(..., batch_size=7)

-In (DDP, Horovod) your effective batch size will be 7 * gpus * num_nodes.
+In DDP or Horovod your effective batch size will be 7 * gpus * num_nodes.

 .. code-block:: python

     # effective batch size = 7 * 8
-    Trainer(gpus=8, accelerator="ddp|horovod")
+    Trainer(gpus=8, accelerator="ddp")
+    Trainer(gpus=8, accelerator="horovod")

     # effective batch size = 7 * 8 * 10
-    Trainer(gpus=8, num_nodes=10, accelerator="ddp|horovod")
+    Trainer(gpus=8, num_nodes=10, accelerator="ddp")
+    Trainer(gpus=8, num_nodes=10, accelerator="horovod")

 In DDP2, your effective batch size will be 7 * num_nodes.
 The reason is that the full batch is visible to all GPUs on the node when using DDP2.
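As a sanity check of the arithmetic the updated docs describe, here is a minimal sketch of the scaling rules. The `effective_batch_size` helper is hypothetical, for illustration only, and not part of the Lightning API.

.. code-block:: python

    # Hypothetical helper mirroring the scaling rules documented above;
    # not part of the Lightning API.
    def effective_batch_size(batch_size: int, gpus: int, num_nodes: int, accelerator: str) -> int:
        if accelerator in ("ddp", "horovod"):
            # DDP and Horovod run one process per GPU, and every process
            # loads its own batch, so the global batch grows by gpus * num_nodes.
            return batch_size * gpus * num_nodes
        if accelerator == "ddp2":
            # DDP2 runs one process per node and splits the batch across the
            # GPUs inside that node, so only num_nodes multiplies the batch.
            return batch_size * num_nodes
        raise ValueError(f"unknown accelerator: {accelerator}")


    # Mirrors the examples in the diff (batch_size=7).
    assert effective_batch_size(7, gpus=8, num_nodes=1, accelerator="ddp") == 7 * 8
    assert effective_batch_size(7, gpus=8, num_nodes=10, accelerator="horovod") == 7 * 8 * 10
    assert effective_batch_size(7, gpus=8, num_nodes=10, accelerator="ddp2") == 7 * 10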