# PYTORCH-LIGHTNING DOCUMENTATION

###### Main Docs
- [LightningModule](Pytorch-Lightning/LightningModule)  
- [Trainer](Trainer/)  

###### New project quick start
1. [Define a LightningModule](https://github.com/williamFalcon/pytorch-lightning/blob/master/examples/new_project_templates/lightning_module_template.py)  
2. Pick a trainer      
    - [Basic CPU Trainer](https://github.com/williamFalcon/pytorch-lightning/blob/master/examples/new_project_templates/trainer_cpu_template.py) 
    - [GPU cluster Trainer](https://github.com/williamFalcon/pytorch-lightning/blob/master/examples/new_project_templates/trainer_gpu_cluster_template.py)

###### Quick start examples 
- CPU example   
- Single GPU example   
- Multi-GPU example
- SLURM cluster grid search example      

###### Training loop
- Accumulate gradients
- Check GPU usage
- Check which gradients are NaN
- Check validation every n epochs
- Display metrics in progress bar
- Force training for a minimum or maximum number of epochs
- Inspect gradient norms
- Hooks
- Learning rate annealing
- Make the model overfit on a subset of the data
- Multiple optimizers (like GANs)
- Set how much of the training set to check (1-100%)
- training_step function

###### Validation loop
- Display metrics in progress bar
- Hooks
- Set how much of the validation set to check (1-100%)
- Set validation check frequency within one training epoch (1-100%)
- validation_step function
- Why does validation run first for 5 steps?

###### Distributed training
- Single-GPU
- Multi-GPU
- Multi-node   
- 16-bit mixed precision

###### Checkpointing
- Model saving
- Model loading 

###### Computing cluster (SLURM)
- Automatic checkpointing   
- Automatic saving and loading
- Running grid search on a cluster 
- Walltime auto-resubmit