Optimization
===============

Learning rate scheduling
-------------------------------------

Every optimizer you use can be paired with any `learning rate scheduler <https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate>`_.

.. code-block:: python

    # no LR scheduler
    def configure_optimizers(self):
        return Adam(...)

    # Adam + LR scheduler
    def configure_optimizers(self):
        return [Adam(...)], [ReduceLROnPlateau()]

    # two optimizers, each with its own scheduler
    def configure_optimizers(self):
        return [Adam(...), SGD(...)], [ReduceLROnPlateau(), LambdaLR()]
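
As a concrete sketch of the pattern above (the ``StepLR`` choice and the ``self.learning_rate`` attribute are assumptions for illustration, not part of the example):

.. code-block:: python

    import torch

    # a minimal sketch; `self.learning_rate` is a hypothetical attribute
    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=self.learning_rate)
        # halve the learning rate every 10 epochs
        scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)
        return [optimizer], [scheduler]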

Use multiple optimizers (like GANs)
-------------------------------------

To use multiple optimizers, return two or more optimizers from :meth:`pytorch_lightning.core.LightningModule.configure_optimizers`.

.. code-block:: python

    # one optimizer
    def configure_optimizers(self):
        return Adam(...)

    # two optimizers, no schedulers
    def configure_optimizers(self):
        return Adam(...), SGD(...)

    # two optimizers, one scheduler for Adam only
    def configure_optimizers(self):
        return [Adam(...), SGD(...)], [ReduceLROnPlateau()]
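
In a GAN-style setup the two optimizers typically cover different parameter groups. A minimal sketch, assuming the module defines ``self.generator`` and ``self.discriminator`` submodules (these names are assumptions for illustration):

.. code-block:: python

    import torch

    # a sketch assuming `self.generator` and `self.discriminator` are nn.Modules
    def configure_optimizers(self):
        opt_g = torch.optim.Adam(self.generator.parameters(), lr=2e-4, betas=(0.5, 0.999))
        opt_d = torch.optim.Adam(self.discriminator.parameters(), lr=2e-4, betas=(0.5, 0.999))
        return opt_g, opt_d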

Lightning will call each optimizer sequentially:

.. code-block:: python

    for epoch in epochs:
        for batch in data:
            for opt in optimizers:
                train_step(opt)
                opt.step()

        for scheduler in schedulers:
            scheduler.step()
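
When more than one optimizer is configured, each ``training_step`` call corresponds to one of them. A sketch of how the current optimizer can be told apart (the extra index argument and the loss helpers here are assumptions for illustration; check the signature for your Lightning version):

.. code-block:: python

    def training_step(self, batch, batch_nb, optimizer_idx):
        # optimizer_idx identifies which optimizer this call belongs to
        if optimizer_idx == 0:
            loss = self.generator_loss(batch)      # hypothetical helper
        else:
            loss = self.discriminator_loss(batch)  # hypothetical helper
        return {'loss': loss}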

Step optimizers at arbitrary intervals
----------------------------------------

To do more interesting things with your optimizers, such as learning-rate warm-up or odd scheduling,
override the :meth:`optimizer_step` function.

For example, here we step optimizer A every 2 batches and optimizer B every 4 batches:

.. code-block:: python

    def optimizer_step(self, current_epoch, batch_nb, optimizer, optimizer_i, second_order_closure=None):
        optimizer.step()
        optimizer.zero_grad()

    # Alternating schedule for optimizer steps (i.e. GANs)
    def optimizer_step(self, current_epoch, batch_nb, optimizer, optimizer_i, second_order_closure=None):
        # update generator opt every 2 batches
        if optimizer_i == 0:
            if batch_nb % 2 == 0:
                optimizer.step()
                optimizer.zero_grad()

        # update discriminator opt every 4 batches
        if optimizer_i == 1:
            if batch_nb % 4 == 0:
                optimizer.step()
                optimizer.zero_grad()

        # ...
        # add as many optimizers as you want
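
The same pattern generalizes to stepping every ``n`` batches; a sketch, where ``self.accumulate_n`` is an assumed attribute (gradients from the skipped batches keep accumulating until the next ``zero_grad``):

.. code-block:: python

    # step once every `self.accumulate_n` batches
    def optimizer_step(self, current_epoch, batch_nb, optimizer, optimizer_i, second_order_closure=None):
        if (batch_nb + 1) % self.accumulate_n == 0:
            optimizer.step()
            optimizer.zero_grad()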

Here we add a learning-rate warm-up:

.. code-block:: python

    # learning rate warm-up
    def optimizer_step(self, current_epoch, batch_nb, optimizer, optimizer_i, second_order_closure=None):
        # warm up lr over the first 500 steps
        if self.trainer.global_step < 500:
            lr_scale = min(1., float(self.trainer.global_step + 1) / 500.)
            for pg in optimizer.param_groups:
                pg['lr'] = lr_scale * self.hparams.learning_rate

        # update params
        optimizer.step()
        optimizer.zero_grad()
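
The warm-up above scales whatever target you store on the module as ``self.hparams.learning_rate``; a minimal sketch of one way to make that attribute available (the constructor shown is an assumption, not something Lightning requires):

.. code-block:: python

    import pytorch_lightning as pl


    class MyModel(pl.LightningModule):

        def __init__(self, hparams):
            super().__init__()
            # hparams is assumed to carry the target learning rate,
            # e.g. an argparse Namespace with hparams.learning_rate == 1e-3
            self.hparams = hparams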