.. _plugins:

#######
Plugins
#######

.. include:: ../links.rst

Plugins allow custom integrations to the internals of the Trainer, such as custom precision or
distributed implementations.

Under the hood, the Lightning Trainer uses plugins in the training routine and adds them automatically
depending on the provided Trainer arguments. For example:

.. code-block:: python

    # accelerator: GPUAccelerator
    # training type: DDPPlugin
    # precision: NativeMixedPrecisionPlugin
    trainer = Trainer(gpus=4, precision=16)


We expose Accelerators and Plugins mainly for expert users that want to extend Lightning for:

- New hardware (like TPU plugin)
- Distributed backends (e.g. a backend not yet supported by
  `PyTorch <https://pytorch.org/docs/stable/distributed.html#backends>`_ itself)
- Clusters (e.g. customized access to the cluster's environment interface)


There are two types of Plugins in Lightning with different responsibilities:

TrainingTypePlugin
------------------

- Launching and teardown of training processes (if applicable)
- Setup communication between processes (NCCL, GLOO, MPI, ...)
- Provide a unified communication interface for reduction, broadcast, etc.
- Provide access to the wrapped LightningModule

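
For example, a built-in training type plugin can be instantiated and configured manually and then handed to the
Trainer instead of the string shorthand. In the illustrative snippet below, ``find_unused_parameters`` is simply
forwarded to PyTorch's ``DistributedDataParallel``:

.. code-block:: python

    from pytorch_lightning import Trainer
    from pytorch_lightning.plugins import DDPPlugin

    # pass a configured training type plugin instead of strategy="ddp";
    # extra keyword arguments are forwarded to torch.nn.parallel.DistributedDataParallel
    trainer = Trainer(gpus=2, strategy=DDPPlugin(find_unused_parameters=False))
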

PrecisionPlugin
---------------

- Perform pre- and post-backward/optimizer step operations such as scaling gradients
- Provide context managers for forward, training_step, etc.
- Gradient clipping

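
As a rough sketch of where these hooks live, the example below subclasses
:class:`~pytorch_lightning.plugins.precision.PrecisionPlugin` and taps into the pre-backward hook. The hook name
and signature are taken from the base class at the time of writing and may differ in other versions, so treat
this as an illustration rather than a template:

.. code-block:: python

    from pytorch_lightning import Trainer
    from pytorch_lightning.plugins.precision import PrecisionPlugin


    class VerbosePrecisionPlugin(PrecisionPlugin):
        """Hypothetical plugin: regular 32-bit precision that logs every backward call."""

        def pre_backward(self, model, closure_loss):
            # called right before the backward pass; the base implementation returns the loss unchanged
            print("running backward on loss:", float(closure_loss))
            return super().pre_backward(model, closure_loss)


    trainer = Trainer(plugins=[VerbosePrecisionPlugin()])
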

Furthermore, for multi-node training Lightning provides cluster environment plugins that allow the advanced user
to configure Lightning to integrate with a :ref:`custom-cluster`.

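
For illustration, a custom cluster environment is typically a thin class that tells Lightning how to read rank and
address information from the scheduler. The sketch below is an assumption-laden outline: the environment variable
names are placeholders for whatever your scheduler exports, and the exact set of required hooks is defined by
:class:`~pytorch_lightning.plugins.environments.ClusterEnvironment` in your installed version (see
:ref:`custom-cluster` for the authoritative example):

.. code-block:: python

    import os

    from pytorch_lightning import Trainer
    from pytorch_lightning.plugins.environments import ClusterEnvironment


    class MyClusterEnvironment(ClusterEnvironment):
        @property
        def creates_processes_externally(self) -> bool:
            # the scheduler already launches one process per device for us
            return True

        def master_address(self) -> str:
            return os.environ["MASTER_ADDR"]  # placeholder variable name

        def master_port(self) -> int:
            return int(os.environ["MASTER_PORT"])  # placeholder variable name

        def world_size(self) -> int:
            return int(os.environ["WORLD_SIZE"])

        def set_world_size(self, size: int) -> None:
            pass  # fully determined by the scheduler

        def global_rank(self) -> int:
            return int(os.environ["RANK"])

        def set_global_rank(self, rank: int) -> None:
            pass  # fully determined by the scheduler

        def local_rank(self) -> int:
            return int(os.environ["LOCAL_RANK"])

        def node_rank(self) -> int:
            return int(os.environ["NODE_RANK"])


    trainer = Trainer(plugins=[MyClusterEnvironment()])
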

.. image:: ../_static/images/accelerator/overview.svg


**********************
Create a custom plugin
**********************

Expert users may choose to extend an existing plugin by overriding its methods ...

.. code-block:: python

    from pytorch_lightning.plugins import DDPPlugin


    class CustomDDPPlugin(DDPPlugin):
        def configure_ddp(self):
            self._model = MyCustomDistributedDataParallel(
                self.model,
                device_ids=...,
            )


or by subclassing the base classes :class:`~pytorch_lightning.plugins.training_type.TrainingTypePlugin` or
:class:`~pytorch_lightning.plugins.precision.PrecisionPlugin` to create new ones. These custom plugins
can then be passed into the Trainer directly or via a (custom) accelerator:

.. code-block:: python

    # custom plugins
    trainer = Trainer(strategy=CustomDDPPlugin(), plugins=[CustomPrecisionPlugin()])

    # fully custom accelerator and plugins
    accelerator = MyAccelerator()
    precision_plugin = MyPrecisionPlugin()
    training_type_plugin = CustomDDPPlugin(accelerator=accelerator, precision_plugin=precision_plugin)
    trainer = Trainer(strategy=training_type_plugin)

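
The ``CustomPrecisionPlugin``, ``MyPrecisionPlugin`` and ``MyAccelerator`` names above are placeholders for
user-defined subclasses. A minimal, purely illustrative sketch of the precision plugins is shown below;
``MyAccelerator`` would similarly subclass one of the built-in accelerators and is omitted here:

.. code-block:: python

    from pytorch_lightning.plugins.precision import PrecisionPlugin


    class CustomPrecisionPlugin(PrecisionPlugin):
        """Placeholder: identical to the default full-precision plugin."""


    class MyPrecisionPlugin(PrecisionPlugin):
        """Placeholder: customize hooks such as gradient clipping as needed."""

        def clip_gradients(self, *args, **kwargs):
            # add custom behaviour before/after the default gradient clipping
            return super().clip_gradients(*args, **kwargs)
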

The full list of built-in plugins is listed below.

.. warning:: The Plugin API is in beta and subject to change.
    For help setting up custom plugins/accelerators, please reach out to us at **support@pytorchlightning.ai**

----------


Training Type Plugins
---------------------

.. currentmodule:: pytorch_lightning.plugins.training_type

.. autosummary::
    :nosignatures:
    :template: classtemplate.rst

    TrainingTypePlugin
    SingleDevicePlugin
    ParallelPlugin
    DataParallelPlugin
    DDPPlugin
    DDP2Plugin
    DDPShardedPlugin
    DDPSpawnShardedPlugin
    DDPSpawnPlugin
    DeepSpeedPlugin
    HorovodPlugin
    SingleTPUPlugin
    TPUSpawnPlugin


Precision Plugins
-----------------

.. currentmodule:: pytorch_lightning.plugins.precision

.. autosummary::
    :nosignatures:
    :template: classtemplate.rst

    PrecisionPlugin
    MixedPrecisionPlugin
    NativeMixedPrecisionPlugin
    ShardedNativeMixedPrecisionPlugin
    ApexMixedPrecisionPlugin
    DeepSpeedPrecisionPlugin
    TPUPrecisionPlugin
    TPUBf16PrecisionPlugin
    DoublePrecisionPlugin
    FullyShardedNativeMixedPrecisionPlugin
    IPUPrecisionPlugin


Cluster Environments
--------------------

.. currentmodule:: pytorch_lightning.plugins.environments

.. autosummary::
    :nosignatures:
    :template: classtemplate.rst

    ClusterEnvironment
    LightningEnvironment
    LSFEnvironment
    TorchElasticEnvironment
    KubeflowEnvironment
    SLURMEnvironment