.. testsetup:: *

    import torch
    from argparse import ArgumentParser, Namespace
    from pytorch_lightning.trainer.trainer import Trainer
    from pytorch_lightning.core.lightning import LightningModule
    import sys

    sys.argv = ["foo"]

Hyperparameters
---------------
Lightning has utilities to interact seamlessly with the command line ``ArgumentParser``
and plays well with the hyperparameter optimization framework of your choice.

----------

ArgumentParser
^^^^^^^^^^^^^^
Lightning is designed to augment a lot of the functionality of the built-in Python ``ArgumentParser``.

.. testcode::

    from argparse import ArgumentParser

    parser = ArgumentParser()
    parser.add_argument("--layer_1_dim", type=int, default=128)
    args = parser.parse_args()

This allows you to call your program like so:

.. code-block:: bash

    python trainer.py --layer_1_dim 64

----------

Argparser Best Practices
^^^^^^^^^^^^^^^^^^^^^^^^
It is best practice to layer your arguments in three sections.

1. Trainer args (``gpus``, ``num_nodes``, etc...)
2. Model specific arguments (``layer_dim``, ``num_layers``, ``learning_rate``, etc...)
3. Program arguments (``data_path``, ``cluster_email``, etc...)

|

We can do this as follows. First, in your ``LightningModule``, define the arguments
specific to that module. Remember that data splits or data paths may also be specific to
a module (i.e.: if your project has a model that trains on Imagenet and another on CIFAR-10).

.. testcode::

    class LitModel(LightningModule):
        @staticmethod
        def add_model_specific_args(parent_parser):
            parser = parent_parser.add_argument_group("LitModel")
            parser.add_argument("--encoder_layers", type=int, default=12)
            parser.add_argument("--data_path", type=str, default="/some/path")
            return parent_parser

Now in your main trainer file, add the ``Trainer`` args, the program args, and the model args:

.. testcode::

    # ----------------
    # trainer_main.py
    # ----------------
    from argparse import ArgumentParser

    parser = ArgumentParser()

    # add PROGRAM level args
    parser.add_argument("--conda_env", type=str, default="some_name")
    parser.add_argument("--notification_email", type=str, default="will@email.com")

    # add model specific args
    parser = LitModel.add_model_specific_args(parser)

    # add all the available trainer options to argparse
    # ie: now --gpus --num_nodes ... --fast_dev_run all work in the cli
    parser = Trainer.add_argparse_args(parser)

    args = parser.parse_args()

Now you can run your program like so:

.. code-block:: bash

    python trainer_main.py --gpus 2 --num_nodes 2 --conda_env 'my_env' --encoder_layers 12

Finally, make sure to start the training like so:

.. code-block:: python

    # init the trainer like this
    trainer = Trainer.from_argparse_args(args, early_stopping_callback=...)

    # NOT like this
    trainer = Trainer(gpus=hparams.gpus, ...)

    # init the model with Namespace directly
    model = LitModel(args)

    # or init the model with all the key-value pairs
    dict_args = vars(args)
    model = LitModel(**dict_args)

----------

LightningModule hyperparameters
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Oftentimes we train many versions of a model. You might share that model or come back to it a few months later,
at which point it is very useful to know how that model was trained (i.e.: what learning rate, neural network, etc...).
Lightning has a few ways of saving that information for you in checkpoints and yaml files. The goal here is to
improve readability and reproducibility.

1. Using :meth:`~pytorch_lightning.core.lightning.LightningModule.save_hyperparameters` within your
   :class:`~pytorch_lightning.core.lightning.LightningModule` ``__init__`` function will enable Lightning
   to store all the provided arguments within the ``self.hparams`` attribute. These hyperparameters will
   also be stored within the model checkpoint, which simplifies model re-instantiation in production settings.
   This also makes those values available via ``self.hparams``.

   .. code-block:: python

       class LitMNIST(LightningModule):
           def __init__(self, layer_1_dim=128, learning_rate=1e-2, **kwargs):
               super().__init__()

               # call this to save (layer_1_dim=128, learning_rate=1e-2) to the checkpoint
               self.save_hyperparameters()

               # equivalent
               self.save_hyperparameters("layer_1_dim", "learning_rate")

               # Now possible to access layer_1_dim from hparams
               self.hparams.layer_1_dim
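
   Because the values are stored in the checkpoint, a minimal sketch of re-instantiating the model above and
   reading them back might look like this (``PATH`` is a placeholder for your checkpoint file):

   .. code-block:: python

       # the saved hyperparameters are restored automatically
       model = LitMNIST.load_from_checkpoint(PATH)

       # the arguments passed to __init__ are available again
       print(model.hparams.layer_1_dim, model.hparams.learning_rate)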

2. Sometimes your init might have objects or other parameters you might not want to save.
   In that case, choose only a few:

   .. code-block:: python

       class LitMNIST(LightningModule):
           def __init__(self, loss_fx, generator_network, layer_1_dim=128, **kwargs):
               super().__init__()
               self.layer_1_dim = layer_1_dim
               self.loss_fx = loss_fx

               # call this to save (layer_1_dim=128) to the checkpoint
               self.save_hyperparameters("layer_1_dim")


       # to load, specify the other args
       model = LitMNIST.load_from_checkpoint(PATH, loss_fx=torch.nn.SomeOtherLoss, generator_network=MyGenerator())

3. You can also convert full objects such as ``dict`` or ``Namespace`` to ``hparams`` so they get saved to the
   checkpoint.

   .. code-block:: python

       class LitMNIST(LightningModule):
           def __init__(self, conf: Optional[Union[Dict, Namespace, DictConfig]] = None, **kwargs):
               super().__init__()
               # save the config and any extra arguments
               self.save_hyperparameters(conf)
               self.save_hyperparameters(kwargs)

               self.layer_1 = nn.Linear(28 * 28, self.hparams.layer_1_dim)
               self.layer_2 = nn.Linear(self.hparams.layer_1_dim, self.hparams.layer_2_dim)
               self.layer_3 = nn.Linear(self.hparams.layer_2_dim, 10)


       conf = {...}
       # OR
       # conf = parser.parse_args()
       # OR
       # conf = OmegaConf.create(...)
       model = LitMNIST(conf=conf, anything=10)

       # Now possible to access any stored variables from hparams
       model.hparams.anything

       # for this to work, you need to access with `self.hparams.layer_1_dim`, not `conf.layer_1_dim`
       model = LitMNIST.load_from_checkpoint(PATH)
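
Once the values are stored in ``self.hparams``, any other hook in the module can read them back. As a small
sketch (assuming a ``learning_rate`` entry was among the saved hyperparameters), ``configure_optimizers``
might use it like this:

.. code-block:: python

    class LitMNIST(LightningModule):
        ...

        def configure_optimizers(self):
            # read the stored hyperparameter instead of hard-coding the value
            return torch.optim.Adam(self.parameters(), lr=self.hparams.learning_rate)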

----------

Trainer args
^^^^^^^^^^^^
To recap, add ALL possible trainer flags to the argparser and init the ``Trainer`` this way:

.. code-block:: python

    parser = ArgumentParser()
    parser = Trainer.add_argparse_args(parser)
    hparams = parser.parse_args()

    trainer = Trainer.from_argparse_args(hparams)

    # or if you need to pass in callbacks
    trainer = Trainer.from_argparse_args(hparams, enable_checkpointing=..., callbacks=[...])
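
For example, a minimal sketch of passing concrete callbacks (here ``EarlyStopping`` and ``ModelCheckpoint``
from ``pytorch_lightning.callbacks``, both monitoring a ``val_loss`` metric that your model is assumed to log):

.. code-block:: python

    from pytorch_lightning.callbacks import EarlyStopping, ModelCheckpoint

    callbacks = [
        EarlyStopping(monitor="val_loss"),
        ModelCheckpoint(monitor="val_loss", save_top_k=1),
    ]
    trainer = Trainer.from_argparse_args(hparams, callbacks=callbacks)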

----------

Multiple Lightning Modules
^^^^^^^^^^^^^^^^^^^^^^^^^^
We often have multiple Lightning Modules where each one has different arguments. Instead of
polluting the ``main.py`` file, the ``LightningModule`` lets you define arguments for each one.

.. testcode::

    class LitMNIST(LightningModule):
        def __init__(self, layer_1_dim, **kwargs):
            super().__init__()
            self.layer_1 = nn.Linear(28 * 28, layer_1_dim)

        @staticmethod
        def add_model_specific_args(parent_parser):
            parser = parent_parser.add_argument_group("LitMNIST")
            parser.add_argument("--layer_1_dim", type=int, default=128)
            return parent_parser

.. testcode::

    class GoodGAN(LightningModule):
        def __init__(self, encoder_layers, **kwargs):
            super().__init__()
            self.encoder = Encoder(layers=encoder_layers)

        @staticmethod
        def add_model_specific_args(parent_parser):
            parser = parent_parser.add_argument_group("GoodGAN")
            parser.add_argument("--encoder_layers", type=int, default=12)
            return parent_parser

Now we can allow each model to inject the arguments it needs in ``main.py``:

.. code-block:: python

    def main(args):
        dict_args = vars(args)

        # pick model
        if args.model_name == "gan":
            model = GoodGAN(**dict_args)
        elif args.model_name == "mnist":
            model = LitMNIST(**dict_args)

        trainer = Trainer.from_argparse_args(args)
        trainer.fit(model)


    if __name__ == "__main__":
        parser = ArgumentParser()
        parser = Trainer.add_argparse_args(parser)

        # figure out which model to use
        parser.add_argument("--model_name", type=str, default="gan", help="gan or mnist")

        # THIS LINE IS KEY TO PULL THE MODEL NAME
        temp_args, _ = parser.parse_known_args()

        # let the model add what it wants
        if temp_args.model_name == "gan":
            parser = GoodGAN.add_model_specific_args(parser)
        elif temp_args.model_name == "mnist":
            parser = LitMNIST.add_model_specific_args(parser)

        args = parser.parse_args()

        # train
        main(args)

And now we can train MNIST or the GAN using the command line interface!

.. code-block:: bash

    $ python main.py --model_name gan --encoder_layers 24
    $ python main.py --model_name mnist --layer_1_dim 128