From 6b6d283d98c6381a90e62010c69845684ab5b0ba Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Adrian=20W=C3=A4lchli?=
Date: Tue, 14 Jul 2020 16:31:30 +0200
Subject: [PATCH] make it clear the example is under the hood (#2607)

---
 docs/source/multi_gpu.rst | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/docs/source/multi_gpu.rst b/docs/source/multi_gpu.rst
index 16c8cce22d..28cb392546 100644
--- a/docs/source/multi_gpu.rst
+++ b/docs/source/multi_gpu.rst
@@ -270,8 +270,7 @@ Distributed Data Parallel
     trainer = Trainer(gpus=8, distributed_backend='ddp', num_nodes=4)
 
 This Lightning implementation of DDP calls your script under the hood multiple times with the correct environment
-variables. If your code does not support this (ie: jupyter notebook, colab, or a nested script without a root package),
-use `dp` or `ddp_spawn`.
+variables:
 
 .. code-block:: bash
 
@@ -280,6 +279,8 @@ use `dp` or `ddp_spawn`.
     MASTER_ADDR=localhost MASTER_PORT=random() WORLD_SIZE=3 NODE_RANK=1 LOCAL_RANK=0 python my_file.py --gpus 3 --etc
     MASTER_ADDR=localhost MASTER_PORT=random() WORLD_SIZE=3 NODE_RANK=2 LOCAL_RANK=0 python my_file.py --gpus 3 --etc
 
+If your code does not support this (ie: jupyter notebook, colab, or a nested script without a root package),
+use `dp` or `ddp_spawn`.
 We use DDP this way because `ddp_spawn` has a few limitations (due to Python and PyTorch):
 
 1. Since `.spawn()` trains the model in subprocesses, the model on the main process does not get updated.
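
The "calls your script under the hood" behaviour documented above can be made visible with a minimal stand-in for `my_file.py` (a sketch, not part of the patch; the printing logic is invented for illustration). It only echoes the environment variables each invocation receives; a real Lightning script does not need to read them, since the Trainer consumes them itself:

    # sketch of a stand-in for my_file.py: print what each invocation of the
    # script receives from the ddp launcher described in the diff above
    import os

    if __name__ == "__main__":
        keys = ("MASTER_ADDR", "MASTER_PORT", "WORLD_SIZE", "NODE_RANK", "LOCAL_RANK")
        print({key: os.environ.get(key) for key in keys})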
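
The first `ddp_spawn` limitation listed at the end of the diff can be reproduced outside Lightning with plain `torch.multiprocessing.spawn` (a sketch under assumed names; the `train` function and the filled-in value are made up for illustration). The child process receives a pickled copy of the model, so in-place updates made there never reach the copy held by the main process:

    # sketch: parameters modified inside a spawned subprocess do not
    # propagate back to the model held by the main process
    import torch
    import torch.multiprocessing as mp

    def train(rank, model):
        # runs in a child process on a copy of `model`
        with torch.no_grad():
            model.weight.fill_(42.0)

    if __name__ == "__main__":
        model = torch.nn.Linear(2, 2)
        mp.spawn(train, args=(model,), nprocs=2)
        # back in the main process the weights are unchanged
        print(model.weight)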