[Fix] Ensure we set the default device before initializing deepspeed (#6460)

* Ensure we set the default device before initializing deepspeed

* Add CHANGELOG.md

* Update pytorch_lightning/plugins/training_type/deepspeed.py

Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>

Sean Naren, 2021-03-10 16:29:37 +00:00, committed by GitHub
parent 7d4e74c745
commit 1c013b43e0
2 changed files with 5 additions and 0 deletions

CHANGELOG.md

@@ -122,6 +122,9 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
- Fixed logger creating directory structure too early in DDP ([#6380](https://github.com/PyTorchLightning/pytorch-lightning/pull/6380))
- Fixed DeepSpeed additional memory use on rank 0 when default device not set early enough ([#6460](https://github.com/PyTorchLightning/pytorch-lightning/pull/6460))
- Fixed LightningModule `all_gather` on cpu tensors ([#6416](https://github.com/PyTorchLightning/pytorch-lightning/pull/6416))

pytorch_lightning/plugins/training_type/deepspeed.py

@@ -231,6 +231,8 @@ class DeepSpeedPlugin(DDPPlugin):
        return optimizer, scheduler, optimizer_frequencies

    def _initialize_deepspeed_train(self, model):
        if self.on_gpu:
            torch.cuda.set_device(self.root_device)
        optimizer, lightning_scheduler, optimizer_frequencies = None, None, None
        if "optimizer" not in self.config:
            rank_zero_info(
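
For context, a minimal sketch of why the two added lines matter, assuming a per-rank `root_device` as in the plugin; the helper name `pin_default_device` and the `local_rank` variable are illustrative only, not part of the Lightning or DeepSpeed API:

```python
import torch


def pin_default_device(root_device: torch.device) -> None:
    """Pin this process's default CUDA device before DeepSpeed starts allocating.

    Without this call, every rank's default device stays ``cuda:0``, so tensors
    created without an explicit device (e.g. during engine initialization) land
    on GPU 0 -- the extra rank-0 memory use this commit fixes.
    """
    if root_device.type == "cuda":
        torch.cuda.set_device(root_device)


# Illustrative per-rank ordering (sketch only):
# pin_default_device(torch.device("cuda", local_rank))
# engine, optimizer, _, scheduler = deepspeed.initialize(model=model, ...)
```

The key point is the ordering: the default device must be set before any call that allocates CUDA tensors without an explicit device, which is why the plugin now does it at the top of `_initialize_deepspeed_train`.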