.. testsetup:: *

    from pytorch_lightning.trainer.trainer import Trainer
    from pytorch_lightning.core.lightning import LightningModule

.. _lr_finder:

Learning Rate Finder
--------------------

For training deep neural networks, selecting a good learning rate is essential
for both better performance and faster convergence. Even optimizers such as
`Adam` that adapt the learning rate on their own can benefit from a well-chosen
initial value.

To reduce the guesswork in choosing a good initial learning rate, a
`learning rate finder` can be used. As described in this `paper <https://arxiv.org/abs/1506.01186>`_,
a learning rate finder does a small run where the learning rate is increased
after each processed batch and the corresponding loss is logged. The result
is an `lr` vs. `loss` plot that can be used as guidance for choosing an optimal
initial lr.

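The sweep described above can be sketched in a few lines of plain Python. This
is a minimal illustration and not Lightning's implementation:
``exponential_lr_sweep``, ``range_test`` and the toy ``toy_loss`` callable are
hypothetical names, and the exponential spacing and the bounds used here are
assumptions chosen for the example.

```python
import math


def exponential_lr_sweep(min_lr, max_lr, num_steps):
    """Return num_steps learning rates spaced evenly on a log scale."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    ratio = max_lr / min_lr
    return [min_lr * ratio ** (i / (num_steps - 1)) for i in range(num_steps)]


def range_test(loss_for_lr, min_lr, max_lr, num_steps):
    """Record one (lr, loss) pair per step while the lr increases."""
    history = {"lr": [], "loss": []}
    for lr in exponential_lr_sweep(min_lr, max_lr, num_steps):
        history["lr"].append(lr)
        history["loss"].append(loss_for_lr(lr))
    return history


def toy_loss(lr):
    # Hypothetical stand-in for "train on one batch at this lr and
    # return the loss"; its minimum sits at lr = 1e-3.
    return (math.log10(lr) + 3.0) ** 2


history = range_test(toy_loss, min_lr=1e-6, max_lr=1.0, num_steps=7)
```

In a real run, ``toy_loss`` would be replaced by an actual training step, and
``history`` would hold the data behind the `lr` vs. `loss` plot.
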
.. warning::

    For the moment, this feature only works with models having a single optimizer.
    LR Finder support for DDP is not implemented yet; it is coming soon.

----------

Using Lightning's built-in LR finder
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In the most basic use case, this feature can be enabled during trainer construction
with ``Trainer(auto_lr_find=True)``. When ``.fit(model)`` is called, the LR finder
will automatically run before any training is done. The ``lr`` that is found
and used will be written to the console and logged together with all other
hyperparameters of the model.

.. testcode::

    # default: no automatic learning rate finder
    trainer = Trainer(auto_lr_find=False)

When enabled, this flag sets your model's learning rate, which can be accessed
via ``self.lr`` or ``self.learning_rate``.

.. testcode::

    from torch.optim import Adam

    class LitModel(LightningModule):

        def __init__(self, learning_rate):
            super().__init__()
            # the LR finder overwrites this attribute with the value it finds
            self.learning_rate = learning_rate

        def configure_optimizers(self):
            return Adam(self.parameters(), lr=self.learning_rate)

    # finds learning rate automatically
    # sets hparams.lr or hparams.learning_rate to that learning rate
    trainer = Trainer(auto_lr_find=True)

To tune an attribute with a different name, pass that name as a string to
``auto_lr_find``.

.. testcode::

    # to set to your own hparams.my_value
    trainer = Trainer(auto_lr_find='my_value')

Under the hood, calling ``fit`` first runs the learning rate finder and then
starts the actual training.

.. code-block:: python

    # when you call .fit() this happens
    # 1. find learning rate
    # 2. actually run fit
    trainer.fit(model)

If you want to inspect the results of the learning rate finder before doing any
actual training, or just play around with the parameters of the algorithm, you
can invoke the ``lr_find`` method of the trainer. A typical example looks like
this:

.. code-block:: python

    model = MyModelClass(hparams)
    trainer = Trainer()

    # Run learning rate finder
    lr_finder = trainer.tuner.lr_find(model)

    # Results can be found in
    lr_finder.results

    # Plot with
    fig = lr_finder.plot(suggest=True)
    fig.show()

    # Pick point based on plot, or get suggestion
    new_lr = lr_finder.suggestion()

    # update hparams of the model
    model.hparams.lr = new_lr

    # Fit model
    trainer.fit(model)

The figure produced by ``lr_finder.plot()`` should look something like the figure
below. It is recommended not to pick the learning rate that achieves the lowest
loss, but instead something in the middle of the sharpest downward slope (red point).
This is the point returned by ``lr_finder.suggestion()``.

.. figure:: /_images/trainer/lr_finder.png

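The recommendation above (prefer the steepest part of the downward slope over
the lowest loss) can be illustrated with a small, self-contained sketch. It is
an assumption about how such a rule might be computed, not the code behind
``lr_finder.suggestion()``: ``steepest_descent_lr`` is a hypothetical helper,
the slope is approximated with simple forward differences, and the
``lrs``/``losses`` values are made up for the example.

```python
def steepest_descent_lr(lrs, losses):
    """Return the learning rate at which the loss falls fastest."""
    if len(lrs) != len(losses) or len(lrs) < 2:
        raise ValueError("need two or more matching (lr, loss) points")
    # Forward differences: slopes[i] is the loss change from point i to i + 1.
    slopes = [losses[i + 1] - losses[i] for i in range(len(losses) - 1)]
    # The most negative slope marks the sharpest drop in loss.
    steepest = min(range(len(slopes)), key=slopes.__getitem__)
    return lrs[steepest]


# Made-up sweep results: the loss drops, bottoms out, then diverges.
lrs = [1e-5, 1e-4, 1e-3, 1e-2, 1e-1]
losses = [2.30, 2.25, 1.60, 1.20, 4.00]

suggested = steepest_descent_lr(lrs, losses)
lowest_loss_lr = lrs[losses.index(min(losses))]
```

In this toy data the suggested value is smaller than the learning rate with the
lowest loss, which sits right before the loss starts to diverge.
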
The parameters of the algorithm can be seen below.

.. autofunction:: pytorch_lightning.tuner.lr_finder.lr_find
    :noindex: