You can install using `pip <https://pypi.org/project/pytorch-lightning/>`_:

.. code-block:: bash

    pip install pytorch-lightning
Or with `conda <https://anaconda.org/conda-forge/pytorch-lightning>`_ (see how to install conda `here <https://docs.conda.io/projects/conda/en/latest/user-guide/install/>`_):
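.. code-block:: bash

    # install from the conda-forge channel
    conda install pytorch-lightning -c conda-forge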
Move your optimizers to the :func:`pytorch_lightning.core.LightningModule.configure_optimizers` hook. Make sure to use the hook parameters (``self`` in this case).
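For example, a minimal sketch that returns a single optimizer (the Adam optimizer and learning rate below are just placeholders):

.. code-block:: python

    class LitModel(pl.LightningModule):

        def configure_optimizers(self):
            # return the optimizer(s) (and optionally LR schedulers) to use for training
            return torch.optim.Adam(self.parameters(), lr=1e-3)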
Lightning automates most of the training for you: the epoch and batch iterations. All you need to keep is the training step logic, which should go into the :func:`pytorch_lightning.core.LightningModule.training_step` hook (make sure to use the hook parameters, ``self`` in this case):
.. code-block::

    class LitModel(pl.LightningModule):

        def training_step(self, batch, batch_idx):
            x, y = batch
            y_hat = self(x)
            loss = F.cross_entropy(y_hat, y)
            return loss
4. Find the val loop "meat"
-----------------------------
Lightning automates the validation loop for you (gradients are enabled in the train loop and disabled during validation). To add an (optional) validation loop, add logic to the :func:`pytorch_lightning.core.LightningModule.validation_step` hook (make sure to use the hook parameters, ``self`` in this case):
.. testcode::

    class LitModel(LightningModule):

        def validation_step(self, batch, batch_idx):
            x, y = batch
            y_hat = self(x)
            val_loss = F.cross_entropy(y_hat, y)
            return val_loss
5. Find the test loop "meat"
-----------------------------
You might also need an optional test loop. Add the following hook to your :class:`~pytorch_lightning.core.LightningModule`:
.. code-block::

    class LitModel(pl.LightningModule):

        def test_step(self, batch, batch_idx):
            x, y = batch
            y_hat = self(x)
            loss = F.cross_entropy(y_hat, y)
            result = pl.EvalResult()
            result.log('test_loss', loss)
            return result
.. note:: The test loop is not automated in Lightning. You will need to specifically call test (this is done so you don't use the test set by mistake).
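For example, a minimal sketch, assuming ``trainer`` and ``model`` are the :class:`~pytorch_lightning.trainer.Trainer` and :class:`~pytorch_lightning.core.LightningModule` instances created later in this guide:

.. code-block:: python

    # run the test loop explicitly, only when you are ready to touch the test set
    trainer.test(model)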
6. Remove any .cuda() or .to(device) calls
------------------------------------------
Your :class:`~pytorch_lightning.core.LightningModule` can automatically run on any hardware!
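For example, a before/after sketch (``x``, ``device``, and ``batch`` here are placeholders, not part of the snippets above):

.. code-block:: python

    # before (pure PyTorch): manual device placement
    x = x.cuda()          # or: x = x.to(device)

    # after (Lightning): use the batch as-is inside your LightningModule;
    # the Trainer moves the model and each batch to the right device for you
    x, y = batch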
7. Wrap loss in a TrainResult/EvalResult
----------------------------------------
Instead of returning the loss you can also use :class:`~pytorch_lightning.core.step_result.TrainResult` and :class:`~pytorch_lightning.core.step_result.EvalResult`, plain Dict objects that give you options for logging on every step and/or at the end of the epoch.
It also allows logging to the progress bar (by setting prog_bar=True). Read more in :ref:`result`.
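For example, a minimal sketch of a ``training_step`` that wraps the loss in a :class:`~pytorch_lightning.core.step_result.TrainResult` and logs it to the progress bar (same names as in the earlier snippets):

.. code-block:: python

    def training_step(self, batch, batch_idx):
        x, y = batch
        y_hat = self(x)
        loss = F.cross_entropy(y_hat, y)
        # tell Lightning which tensor to minimize and log it on every step
        result = pl.TrainResult(minimize=loss)
        result.log('train_loss', loss, prog_bar=True)
        return result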
Init your :class:`~pytorch_lightning.core.LightningModule`, your PyTorch dataloaders, and then the PyTorch Lightning :class:`~pytorch_lightning.trainer.Trainer`.

.. code-block:: python

    # most basic trainer, uses good defaults (auto-tensorboard, checkpoints, logs, and more)
    trainer = pl.Trainer()
    trainer.fit(model, train_loader, val_loader)

The :class:`~pytorch_lightning.trainer.Trainer` will automate:
* The epoch iteration
* The batch iteration
* The calling of optimizer.step()
* :ref:`weights-loading`
* Logging to Tensorboard (see :ref:`loggers` options)
* :ref:`multi-gpu-training` support
* :ref:`tpu`
* :ref:`16-bit` support
All automated code is rigorously tested and benchmarked.
Check out more flags in the :ref:`trainer` docs.
Using CPUs/GPUs/TPUs
====================
It's trivial to use CPUs, GPUs or TPUs in Lightning. There's NO NEED to change your code; simply change the :class:`~pytorch_lightning.trainer.Trainer` options.
.. code-block:: python

    # train on 1024 CPUs across 128 machines
    trainer = pl.Trainer(
        num_processes=8,
        num_nodes=128
    )
.. code-block:: python

    # train on 1 GPU
    trainer = pl.Trainer(gpus=1)
.. code-block:: python

    # train on 256 GPUs
    trainer = pl.Trainer(
        gpus=8,
        num_nodes=32
    )
.. code-block:: python

    # Multi GPU with mixed precision
    trainer = pl.Trainer(gpus=2, precision=16)
.. code-block:: python

    # Train on TPUs
    trainer = pl.Trainer(tpu_cores=8)
Without changing a SINGLE line of your code, you can now do the following with the above code:
.. code-block:: python

    # train on TPUs using 16 bit precision with early stopping
    # using only half the training data and checking validation every quarter of a training epoch
    trainer = pl.Trainer(
        tpu_cores=8,
        precision=16,
        early_stop_callback=True,
        limit_train_batches=0.5,
        val_check_interval=0.25
    )
- `Automatic early stopping <https://pytorch-lightning.readthedocs.io/en/stable/early_stopping.html>`_
- `Add custom callbacks <https://pytorch-lightning.readthedocs.io/en/stable/callbacks.html>`_ (self-contained programs that can be reused across projects)
- `Dry run mode <https://pytorch-lightning.readthedocs.io/en/stable/debugging.html#fast-dev-run>`_ (hit every line of your code once to see if you have bugs, instead of waiting hours to crash on validation ;)
- `Automatically overfit your model for a sanity test <https://pytorch-lightning.readthedocs.io/en/stable/debugging.html?highlight=overfit#make-model-overfit-on-subset-of-data>`_
- `Automatically scale your batch size <https://pytorch-lightning.readthedocs.io/en/stable/training_tricks.html?highlight=batch%20size#auto-scaling-of-batch-size>`_
- `Automatically find a good learning rate <https://pytorch-lightning.readthedocs.io/en/stable/lr_finder.html>`_
- `Load checkpoints directly from S3 <https://pytorch-lightning.readthedocs.io/en/stable/weights_loading.html#checkpoint-loading>`_
- `Profile your code for speed/memory bottlenecks <https://pytorch-lightning.readthedocs.io/en/stable/profiler.html>`_
- `Scale to massive compute clusters <https://pytorch-lightning.readthedocs.io/en/stable/slurm.html>`_
- `Use multiple dataloaders per train/val/test loop <https://pytorch-lightning.readthedocs.io/en/stable/multiple_loaders.html>`_
- `Use multiple optimizers to do Reinforcement learning or even GANs <https://pytorch-lightning.readthedocs.io/en/stable/optimizers.html?highlight=multiple%20optimizers#use-multiple-optimizers-like-gans>`_
Or read our :ref:`introduction-guide` to learn more!
-------------
Masterclass
===========
Go pro by tuning in to our Masterclass! New episodes every week.