# Basic Examples

Use these examples to test how Lightning works.

## MNIST

Trains MNIST where the model is defined inside the LightningModule.

```bash
# cpu
python simple_image_classifier.py

# gpus (any number)
python simple_image_classifier.py --gpus 2

# dataparallel
python simple_image_classifier.py --gpus 2 --distributed_backend 'dp'
```

## MNIST with DALI

The MNIST example above, using NVIDIA DALI for data loading. Requires NVIDIA DALI to be installed for your CUDA version; see the NVIDIA DALI installation guide.

```bash
python dali_image_classifier.py
```

## Image classifier

Generic image classifier with an arbitrary backbone (i.e., a simple system).

```bash
# cpu
python backbone_image_classifier.py

# gpus (any number)
python backbone_image_classifier.py --gpus 2

# dataparallel
python backbone_image_classifier.py --gpus 2 --distributed_backend 'dp'
```

## Autoencoder

Shows the power of a system: the training loop can be arbitrarily complex.

```bash
# cpu
python autoencoder.py

# gpus (any number)
python autoencoder.py --gpus 2

# dataparallel
python autoencoder.py --gpus 2 --distributed_backend 'dp'
```

## Multi-node example

This demo launches a job using 2 GPUs on each of 2 different nodes (4 GPUs total). To run it, do the following:

1. Log into the jump host node of your SLURM-managed cluster.
2. Create a conda environment with Lightning and a GPU-enabled PyTorch build.
3. Choose a script to submit.

### DDP

Submit this job to run with DistributedDataParallel (2 nodes, 2 GPUs each):

```bash
sbatch submit_ddp_job.sh YourEnv
```
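The submit script wraps the `sbatch` call; a representative SLURM script for the 2-node, 2-GPU setup might look like the following. All directives and the target script are illustrative, so check `submit_ddp_job.sh` in this directory for the actual contents.

```bash
#!/bin/bash
#SBATCH --nodes=2              # 2 nodes
#SBATCH --gres=gpu:2           # 2 GPUs per node
#SBATCH --ntasks-per-node=2    # one task per GPU, as Lightning's DDP expects
#SBATCH --time=0-02:00:00

# activate the conda environment passed as the first argument (e.g. YourEnv)
source activate $1

# Lightning reads the SLURM environment to wire up the process group
srun python simple_image_classifier.py --gpus 2 --num_nodes 2 --distributed_backend 'ddp'
```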

### DDP2

Submit this job to run with a different implementation of DistributedDataParallel: within each node it behaves like DataParallel, but it syncs gradients across nodes like DDP.

```bash
sbatch submit_ddp2_job.sh YourEnv
```