Update docs (#1656)

* edit doc

mentioned in #646

* edit doc

* underline

* class reference

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Jacob Zhong 2020-04-29 08:55:06 -04:00 committed by GitHub
parent 79196246cf
commit 9b86aea98b
2 changed files with 20 additions and 40 deletions


@@ -42,45 +42,26 @@ Must use an int if using an IterableDataset.
# check every 100 train batches (i.e., for IterableDatasets or fixed frequency)
trainer = Trainer(val_check_interval=100)
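For map-style datasets, a float fraction of the training epoch works as well; a minimal sketch:

.. code-block:: python

    # check the validation set 4 times during one training epoch
    trainer = Trainer(val_check_interval=0.25)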
Use training data subset
------------------------
If you don't want to check 100% of the training set (for debugging or if it's huge), set this flag.
Use data subset for training, validation and test
-------------------------------------------------
If you don't want to check 100% of the training/validation/test set (for debugging or if it's huge), set these flags.
.. code-block:: python
# DEFAULT
trainer = Trainer(train_percent_check=1.0)
trainer = Trainer(
train_percent_check=1.0,
val_percent_check=1.0,
test_percent_check=1.0
)
# check 10% only
trainer = Trainer(train_percent_check=0.1)
# check 10%, 20%, 30% of the training, validation and test set, respectively
trainer = Trainer(
train_percent_check=0.1,
val_percent_check=0.2,
test_percent_check=0.3
)
.. note:: ``train_percent_check`` will be overwritten by ``overfit_pct`` if ``overfit_pct`` > 0.
.. note:: ``train_percent_check``, ``val_percent_check`` and ``test_percent_check`` will be overwritten by ``overfit_pct`` if ``overfit_pct`` > 0. ``val_percent_check`` will be ignored if ``fast_dev_run=True``.
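As a quick illustration of the precedence described in the note above, a minimal sketch using the ``overfit_pct`` flag:

.. code-block:: python

    # all three splits are capped at 1% of their data,
    # regardless of the *_percent_check values above
    trainer = Trainer(overfit_pct=0.01)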
Use test data subset
--------------------
If you don't want to check 100% of the test set (for debugging or if it's huge), set this flag.
.. code-block:: python
# DEFAULT
trainer = Trainer(test_percent_check=1.0)
# check 10% only
trainer = Trainer(test_percent_check=0.1)
.. note:: ``test_percent_check`` will be overwritten by ``overfit_pct`` if ``overfit_pct`` > 0.
Use validation data subset
--------------------------
If you don't want to check 100% of the validation set (for debugging or if it's huge), set this flag.
.. code-block:: python
# DEFAULT
trainer = Trainer(val_percent_check=1.0)
# check 10% only
trainer = Trainer(val_percent_check=0.1)
.. note:: ``val_percent_check`` will be overwritten by ``overfit_pct`` if ``overfit_pct`` > 0 and ignored if
``fast_dev_run=True``.
.. note:: If you set ``val_percent_check=0``, validation will be disabled.
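A minimal sketch of that behaviour:

.. code-block:: python

    # disable validation entirely
    trainer = Trainer(val_percent_check=0)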


@@ -11,17 +11,14 @@ To train a model using multiple nodes, do the following:
1. Design your LightningModule.
2. Add `torch.DistributedSampler <https://pytorch.org/docs/stable/data.html#torch.utils.data.distributed.DistributedSampler>`_
which gives each GPU access to a subset of your full dataset.
3. Enable ddp in the trainer
2. Enable ddp in the trainer
.. code-block:: python
# train on 32 GPUs across 4 nodes
trainer = Trainer(gpus=8, num_nodes=4, distributed_backend='ddp')
4. It's a good idea to structure your train.py file like this:
3. It's a good idea to structure your train.py file like this:
.. code-block:: python
@@ -91,6 +88,8 @@ To train a model using multiple nodes, do the following:
sbatch submit.sh
.. note:: Using :class:`~torch.utils.data.distributed.DistributedSampler` is already handled by Lightning.
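In practice this means your dataloaders need no manual sampler; a minimal sketch (``MyModel`` and ``self.dataset`` are hypothetical placeholders):

.. code-block:: python

    from pytorch_lightning import LightningModule
    from torch.utils.data import DataLoader

    class MyModel(LightningModule):
        def train_dataloader(self):
            # return a plain DataLoader; with ddp enabled, Lightning
            # injects a DistributedSampler for you
            return DataLoader(self.dataset, batch_size=32)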
Walltime auto-resubmit
----------------------
When you use Lightning in a SLURM cluster, Lightning automatically detects when it is about
to run out of walltime and resubmits the job.