# Basic Examples

Use these examples to test how Lightning works.

## Test on CPU

```bash
python cpu_template.py
```
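All of these templates follow the same pattern: define a `LightningModule` and hand it to a `Trainer`. Below is a minimal, self-contained sketch of that pattern; `RandomDataset` and `BoringModel` are illustrative stand-ins, not the actual template model the scripts build.

```python
# A minimal sketch of the pattern the template scripts follow.
# RandomDataset and BoringModel are illustrative stand-ins, not the
# template model the real scripts use.
import torch
from torch.utils.data import DataLoader, Dataset

import pytorch_lightning as pl


class RandomDataset(Dataset):
    """Random tensors, just enough to drive a training loop."""

    def __init__(self, size=64, length=256):
        self.data = torch.randn(length, size)

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return self.data[idx]


class BoringModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(64, 2)

    def forward(self, x):
        return self.layer(x)

    def training_step(self, batch, batch_idx):
        # Any differentiable scalar works as a loss for this demo.
        loss = self(batch).sum()
        return {'loss': loss}

    def train_dataloader(self):
        return DataLoader(RandomDataset(), batch_size=32)

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.02)


if __name__ == '__main__':
    trainer = pl.Trainer(max_epochs=1)  # no gpus argument: runs on CPU
    trainer.fit(BoringModel())
```

The GPU templates below differ only in the arguments passed to the `Trainer`.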

## Train on a single GPU

```bash
python gpu_template.py --gpus 1
```

## DataParallel (dp)

Train on multiple GPUs using DataParallel.

```bash
python gpu_template.py --gpus 2 --distributed_backend dp
```

## DistributedDataParallel (ddp)

Train on multiple GPUs using DistributedDataParallel.

```bash
python gpu_template.py --gpus 2 --distributed_backend ddp
```

## DistributedDataParallel + DP (ddp2)

Train on multiple GPUs using DistributedDataParallel + DataParallel: within each node, all GPUs work on a single copy of the model (as in DataParallel), and gradient information is then shared across nodes (as in DDP).

```bash
python gpu_template.py --gpus 2 --distributed_backend ddp2
```
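All three multi-GPU commands go through the same code path: the CLI flags are forwarded to the `Trainer`. A hedged sketch of that mapping (the argument parsing is trimmed down relative to what `gpu_template.py` actually does):

```python
# Sketch: the --gpus and --distributed_backend CLI flags map directly
# onto Trainer arguments of the same names.
from argparse import ArgumentParser

import pytorch_lightning as pl

parser = ArgumentParser()
parser.add_argument('--gpus', type=int, default=2)
parser.add_argument('--distributed_backend', type=str, default='dp',
                    choices=('dp', 'ddp', 'ddp2'))
args = parser.parse_args()

trainer = pl.Trainer(
    gpus=args.gpus,
    distributed_backend=args.distributed_backend,
)
# trainer.fit(model) then trains with the chosen backend.
```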

## Multi-node example

This demo launches a job using 2 GPUs on each of 2 different nodes (4 GPUs total). To run it:

1. Log into the jumphost node of your SLURM-managed cluster.
2. Create a conda environment with Lightning and a GPU-enabled PyTorch build.
3. Choose a script to submit.

### DDP

Submit this job to run with DistributedDataParallel (2 nodes, 2 GPUs each):

```bash
sbatch submit_ddp_job.sh YourEnv
```

### DDP2

Submit this job to run with a different implementation of DistributedDataParallel: each node acts like DataParallel internally but syncs gradients across nodes like DDP:

```bash
sbatch submit_ddp2_job.sh YourEnv
```
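Inside the scripts these jobs launch, going multi-node is mostly a matter of two extra `Trainer` arguments; Lightning reads the rest from the SLURM environment. A sketch, assuming the 2-node, 2-GPUs-per-node layout above:

```python
# Sketch: Trainer configuration matching a 2-node, 2-GPUs-per-node SLURM job.
import pytorch_lightning as pl

trainer = pl.Trainer(
    num_nodes=2,                 # must match the node count requested from SLURM
    gpus=2,                      # GPUs per node
    distributed_backend='ddp',   # or 'ddp2' for the DDP2 variant
)
# trainer.fit(model) inside the submitted script then runs the distributed job.
```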