# PYTORCH-LIGHTNING DOCUMENTATION
###### Main Docs
- [LightningModule](Pytorch-Lightning/LightningModule)
- [Trainer](Trainer/)
###### New project Quick Start
1. [Define a LightningModule](https://github.com/williamFalcon/pytorch-lightning/blob/master/examples/new_project_templates/lightning_module_template.py)
2. Pick a trainer
   - [Basic CPU Trainer](https://github.com/williamFalcon/pytorch-lightning/blob/master/examples/new_project_templates/trainer_cpu_template.py)
   - [GPU cluster Trainer](https://github.com/williamFalcon/pytorch-lightning/blob/master/examples/new_project_templates/trainer_gpu_cluster_template.py)
###### Quick start examples
- CPU example
- Single GPU example
- Multi-GPU example
- SLURM cluster grid search example
###### Training loop
- Accumulate gradients
- Check GPU usage
- Check which gradients are NaN
- Check validation every n epochs
- Display metrics in progress bar
- Force training for min or max epochs
- Inspect gradient norms
- Hooks
- Learning rate annealing
- Make model overfit on subset of data
- Multiple optimizers (like GANs)
- Set how much of the training set to check (1-100%)
- training_step function
###### Validation loop
- Display metrics in progress bar
- Hooks
- Set how much of the validation set to check (1-100%)
- Set validation check frequency within 1 training epoch (1-100%)
- validation_step function
- Why does validation run first for 5 steps?
###### Distributed training
- Single-GPU
- Multi-GPU
- Multi-node
- 16-bit mixed precision
###### Checkpointing
- Model saving
- Model loading
###### Computing cluster (SLURM)
- Automatic checkpointing
- Automatic saving, loading
- Running grid search on a cluster
- Walltime auto-resubmit