From 6b6d283d98c6381a90e62010c69845684ab5b0ba Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Adrian=20W=C3=A4lchli?=
Date: Tue, 14 Jul 2020 16:31:30 +0200
Subject: [PATCH] make it clear the example is under the hood (#2607)

---
 docs/source/multi_gpu.rst | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/docs/source/multi_gpu.rst b/docs/source/multi_gpu.rst
index 16c8cce22d..28cb392546 100644
--- a/docs/source/multi_gpu.rst
+++ b/docs/source/multi_gpu.rst
@@ -270,8 +270,7 @@ Distributed Data Parallel
     trainer = Trainer(gpus=8, distributed_backend='ddp', num_nodes=4)
 
 This Lightning implementation of DDP calls your script under the hood multiple times with the correct environment
-variables. If your code does not support this (ie: jupyter notebook, colab, or a nested script without a root package),
-use `dp` or `ddp_spawn`.
+variables:
 
 .. code-block:: bash
 
@@ -280,6 +279,8 @@ use `dp` or `ddp_spawn`.
     MASTER_ADDR=localhost MASTER_PORT=random() WORLD_SIZE=3 NODE_RANK=1 LOCAL_RANK=0 python my_file.py --gpus 3 --etc
     MASTER_ADDR=localhost MASTER_PORT=random() WORLD_SIZE=3 NODE_RANK=2 LOCAL_RANK=0 python my_file.py --gpus 3 --etc
 
+If your code does not support this (ie: jupyter notebook, colab, or a nested script without a root package),
+use `dp` or `ddp_spawn`.
 We use DDP this way because `ddp_spawn` has a few limitations (due to Python and PyTorch):
 
 1. Since `.spawn()` trains the model in subprocesses, the model on the main process does not get updated.
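
The "calls your script under the hood" behaviour documented above can be made visible with a minimal stand-in for `my_file.py` (a sketch, not part of the patch; the printing logic is invented for illustration). It only echoes the environment variables each invocation receives; a real Lightning script does not need to read them, since the Trainer consumes them itself:

    # sketch of a stand-in for my_file.py: print what each invocation of the
    # script receives from the ddp launcher described in the diff above
    import os

    if __name__ == "__main__":
        keys = ("MASTER_ADDR", "MASTER_PORT", "WORLD_SIZE", "NODE_RANK", "LOCAL_RANK")
        print({key: os.environ.get(key) for key in keys})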
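
The first `ddp_spawn` limitation listed at the end of the diff can be reproduced outside Lightning with plain `torch.multiprocessing.spawn` (a sketch under assumed names; the `train` function and the filled-in value are made up for illustration). The child process receives a pickled copy of the model, so in-place updates made there never reach the copy held by the main process:

    # sketch: parameters modified inside a spawned subprocess do not
    # propagate back to the model held by the main process
    import torch
    import torch.multiprocessing as mp

    def train(rank, model):
        # runs in a child process on a copy of `model`
        with torch.no_grad():
            model.weight.fill_(42.0)

    if __name__ == "__main__":
        model = torch.nn.Linear(2, 2)
        mp.spawn(train, args=(model,), nprocs=2)
        # back in the main process the weights are unchanged
        print(model.weight)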