Examples

This folder has 3 sections:

Basic Examples

Use these examples to try out how Lightning works.

Test on CPU

python cpu_template.py

Train on a single GPU

python gpu_template.py --gpus 1

DataParallel (dp)

Train on multiple GPUs using DataParallel.

python gpu_template.py --gpus 2 --distributed_backend dp

DistributedDataParallel (ddp)

Train on multiple GPUs using DistributedDataParallel.

python gpu_template.py --gpus 2 --distributed_backend ddp

DistributedDataParallel+DP (ddp2)

Train on multiple GPUs using DistributedDataParallel + DataParallel. Within a single node, all GPUs are used for one model, as in DataParallel. Gradient information is then shared across nodes.

python gpu_template.py --gpus 2 --distributed_backend ddp2
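The commands above differ only in the `--gpus` and `--distributed_backend` flags. As a rough illustration of how a script might read those flags, here is a minimal sketch using only the standard library. The function names `build_parser` and `trainer_kwargs` are hypothetical and are not taken from `gpu_template.py`, which may define more options and handle them differently.

```python
import argparse


def build_parser():
    # Hypothetical parser: only the two flags used in the commands above.
    parser = argparse.ArgumentParser(description="Sketch of a template CLI")
    parser.add_argument("--gpus", type=int, default=0,
                        help="number of GPUs to train on (0 means CPU)")
    parser.add_argument("--distributed_backend", type=str, default=None,
                        choices=["dp", "ddp", "ddp2"],
                        help="how training is spread across GPUs")
    return parser


def trainer_kwargs(args):
    # Collect the settings a script like this would pass on to its trainer.
    return {"gpus": args.gpus,
            "distributed_backend": args.distributed_backend}


# Equivalent to: python gpu_template.py --gpus 2 --distributed_backend dp
example = build_parser().parse_args(["--gpus", "2",
                                     "--distributed_backend", "dp"])
print(trainer_kwargs(example))
```

Restricting `--distributed_backend` to the three values shown in this section makes a mistyped backend fail at parse time rather than partway through startup.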

Multi-node example

This demo launches a job using 2 GPUs on 2 different nodes (4 GPUs total). To run this demo do the following:

1. Log into the jumphost node of your SLURM-managed cluster.
2. Create a conda environment with Lightning and a GPU PyTorch version.
3. Choose a script to submit.

DDP

Submit this job to run with DistributedDataParallel (2 nodes, 2 GPUs each).

sbatch ddp_job_submit.sh YourEnv

DDP2

Submit this job to run with a different implementation of DistributedDataParallel. In this version, each node acts like DataParallel but syncs across nodes like DDP.

sbatch ddp2_job_submit.sh YourEnv
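To make the "2 GPUs on 2 nodes (4 GPUs total)" arithmetic concrete, the sketch below computes how many processes take part in gradient syncing and how each one could be numbered. It assumes one process per GPU for DDP and one process per node for DDP2, which is how the descriptions above read. The helper names `world_size` and `global_rank` are hypothetical and not part of the job scripts.

```python
def world_size(num_nodes, gpus_per_node, backend="ddp"):
    # Number of processes that exchange gradients with each other.
    if backend == "ddp":
        # Assumption: one process per GPU.
        return num_nodes * gpus_per_node
    if backend == "ddp2":
        # Assumption: one process per node, driving all of that node's GPUs.
        return num_nodes
    raise ValueError(f"unsupported backend: {backend}")


def global_rank(node_rank, local_rank, gpus_per_node):
    # Unique index of a DDP process across all nodes.
    return node_rank * gpus_per_node + local_rank


# The demo above: 2 nodes with 2 GPUs each.
print(world_size(2, 2, "ddp"))
print(world_size(2, 2, "ddp2"))
print(global_rank(node_rank=1, local_rank=1, gpus_per_node=2))
```

Under these assumptions, the DDP job has four cooperating processes, while the DDP2 job has two, each spreading its work over the GPUs of its own node.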

Domain templates

These are templates to show common approaches such as GANs and RL.