diff --git a/docs/source/multi_gpu.rst b/docs/source/multi_gpu.rst
index e528bab3bb..1d696a7ec8 100644
--- a/docs/source/multi_gpu.rst
+++ b/docs/source/multi_gpu.rst
@@ -260,7 +260,7 @@ Distributed Data Parallel
 
 3. Each process inits the model.
 
-.. note:: Make sure to set the random seed so that each model initializes with the same weights.
+.. note:: Make sure to set the random seed before the instantiation of a ``Trainer()`` so that each model initializes with the same weights.
 
 4. Each process performs a full forward and backward pass in parallel.
 
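To illustrate why the note asks for the seed to be set before anything is constructed, here is a minimal sketch using only Python's standard library. It is not the library's API: ``init_weights`` is a hypothetical stand-in for model construction, the stdlib ``random`` module stands in for the framework's random number generator, and the two calls merely simulate two DDP processes.

```python
import random


def init_weights(n, seed=None):
    """Hypothetical stand-in for model construction: draw n weights from the global RNG."""
    if seed is not None:
        # Seed BEFORE any weights are drawn, mirroring "seed before instantiation".
        random.seed(seed)
    return [random.gauss(0.0, 1.0) for _ in range(n)]


# Two simulated processes that seed identically start from identical weights.
rank0_seeded = init_weights(4, seed=1234)
rank1_seeded = init_weights(4, seed=1234)
same_when_seeded = rank0_seeded == rank1_seeded

# Without a shared seed, each draw continues from whatever RNG state exists,
# so the two simulated processes end up with different initial weights.
rank0_unseeded = init_weights(4)
rank1_unseeded = init_weights(4)
same_when_unseeded = rank0_unseeded == rank1_unseeded
```

In a real multi-process run each process has its own RNG state, so the unseeded case diverges for the same underlying reason: nothing forces the processes to draw the same numbers.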