2019-06-27 18:22:00 +00:00

Lightning makes multi-GPU training and 16-bit training trivial.

*Note:* None of the flags below require changing anything about your LightningModule definition.

---

#### 16-bit mixed precision

16-bit precision can cut your memory footprint in half. On Volta-architecture GPUs it can also give a dramatic training speed-up.

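The "half" figure follows directly from parameter width: fp16 stores 2 bytes per value versus 4 bytes for fp32. A back-of-envelope sketch (the parameter count below is a made-up example):

```python
def model_memory_bytes(n_params, bits):
    # each parameter occupies bits / 8 bytes
    return n_params * bits // 8

n_params = 100_000_000  # hypothetical 100M-parameter model
fp32 = model_memory_bytes(n_params, 32)  # 400,000,000 bytes (~400 MB)
fp16 = model_memory_bytes(n_params, 16)  # 200,000,000 bytes (~200 MB)
```

Activations and gradients shrink by the same factor, which is where much of the savings comes from during training.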
First, install apex (if the install fails, look [here](https://github.com/NVIDIA/apex)):

```bash
$ git clone https://github.com/NVIDIA/apex
$ cd apex
$ pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
```

Then set `use_amp` to True:

```python
# DEFAULT
trainer = Trainer(amp_level='O2', use_amp=False)

# turn on 16-bit
trainer = Trainer(amp_level='O2', use_amp=True)
```

---

#### Single-GPU

Make sure you're on a GPU machine.

```python
import os

# set these flags (before CUDA is initialized)
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# DEFAULT
trainer = Trainer(gpus=[0])
```

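For context, `CUDA_VISIBLE_DEVICES` controls which physical GPUs the process can see, and CUDA re-indexes the visible devices from 0. A small sketch of that remapping (`visible_gpu_map` is our illustrative helper, not a Lightning or CUDA API):

```python
def visible_gpu_map(cuda_visible_devices):
    """Map CUDA's logical device index (cuda:0, cuda:1, ...) to the
    physical GPU id implied by a CUDA_VISIBLE_DEVICES string."""
    physical = [int(x) for x in cuda_visible_devices.split(",") if x.strip()]
    return dict(enumerate(physical))

# "0" exposes only physical GPU 0, which becomes cuda:0
print(visible_gpu_map("0"))    # {0: 0}
# "2,3" would make physical GPU 2 appear as cuda:0 and GPU 3 as cuda:1
print(visible_gpu_map("2,3"))  # {0: 2, 1: 3}
```
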
---

#### Multi-GPU

Make sure you're on a GPU machine. You can set as many GPUs as you want.

In this example, the model will run on all 8 GPUs at once using DataParallel under the hood.

```python
import os

# set these flags (before CUDA is initialized)
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1,2,3,4,5,6,7"

# DEFAULT
trainer = Trainer(gpus=[0, 1, 2, 3, 4, 5, 6, 7])
```

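Under the hood, DataParallel scatters each batch across the GPUs, runs a replica of the model on each shard, then gathers the outputs back. A pure-Python sketch of that scatter/apply/gather flow (sequential here, with a toy stand-in for the model; the real implementation runs the replicas in parallel):

```python
def scatter(batch, n_devices):
    # split the batch into near-equal contiguous shards, one per device
    chunk = (len(batch) + n_devices - 1) // n_devices
    return [batch[i:i + chunk] for i in range(0, len(batch), chunk)]

def data_parallel_sketch(batch, n_devices, forward):
    shards = scatter(batch, n_devices)          # scatter inputs across devices
    outputs = [forward(s) for s in shards]      # each replica forwards its shard
    return [y for out in outputs for y in out]  # gather outputs back together

# toy "model" that doubles each element of its shard
result = data_parallel_sketch(list(range(8)), n_devices=4,
                              forward=lambda xs: [2 * x for x in xs])
print(result)  # [0, 2, 4, 6, 8, 10, 12, 14]
```
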
---

#### Multi-node

COMING SOON.

---

#### Self-balancing architecture

Here Lightning distributes parts of your module across available GPUs to optimize for speed and memory.

COMING SOON.