Update docs (#1656)
* edit doc mentioned in #646
* edit doc
* underline
* class reference

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
This commit is contained in:
parent 79196246cf
commit 9b86aea98b
@@ -42,45 +42,26 @@ Must use an int if using an IterableDataset.
 
     # check every 100 train batches (ie: for IterableDatasets or fixed frequency)
     trainer = Trainer(val_check_interval=100)
 
-Use training data subset
-------------------------
-If you don't want to check 100% of the training set (for debugging or if it's huge), set this flag.
+Use data subset for training, validation and test
+-------------------------------------------------
+If you don't want to check 100% of the training/validation/test set (for debugging or if it's huge), set these flags.
 
 .. code-block:: python
 
     # DEFAULT
-    trainer = Trainer(train_percent_check=1.0)
+    trainer = Trainer(
+        train_percent_check=1.0,
+        val_percent_check=1.0,
+        test_percent_check=1.0
+    )
 
-    # check 10% only
-    trainer = Trainer(train_percent_check=0.1)
+    # check 10%, 20%, 30% only, respectively for training, validation and test set
+    trainer = Trainer(
+        train_percent_check=0.1,
+        val_percent_check=0.2,
+        test_percent_check=0.3
+    )
 
-.. note:: ``train_percent_check`` will be overwritten by ``overfit_pct`` if ``overfit_pct`` > 0.
+.. note:: ``train_percent_check``, ``val_percent_check`` and ``test_percent_check`` will be overwritten by ``overfit_pct`` if ``overfit_pct`` > 0. ``val_percent_check`` will be ignored if ``fast_dev_run=True``.
-
-Use test data subset
---------------------
-If you don't want to check 100% of the test set (for debugging or if it's huge), set this flag.
-
-.. code-block:: python
-
-    # DEFAULT
-    trainer = Trainer(test_percent_check=1.0)
-
-    # check 10% only
-    trainer = Trainer(test_percent_check=0.1)
-
-.. note:: ``test_percent_check`` will be overwritten by ``overfit_pct`` if ``overfit_pct`` > 0.
-
-Use validation data subset
---------------------------
-If you don't want to check 100% of the validation set (for debugging or if it's huge), set this flag.
-
-.. code-block:: python
-
-    # DEFAULT
-    trainer = Trainer(val_percent_check=1.0)
-
-    # check 10% only
-    trainer = Trainer(val_percent_check=0.1)
-
-.. note:: ``val_percent_check`` will be overwritten by ``overfit_pct`` if ``overfit_pct`` > 0 and ignored if
-   ``fast_dev_run=True``.
-.. note:: If you set ``val_percent_check=0``, validation will be disabled.
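The ``*_percent_check`` flags consolidated in the hunk above all do the same arithmetic: scale the number of batches run per epoch by a fraction. A minimal sketch of that behavior in plain Python (``limit_batches`` is a hypothetical helper for illustration, not Lightning's implementation):

```python
def limit_batches(total_batches: int, percent_check: float) -> int:
    """Sketch of how a fraction flag like ``train_percent_check`` caps
    the batches run per epoch. Hypothetical helper, not the Lightning API:
    a zero fraction disables the loop entirely, any positive fraction
    runs at least one batch."""
    if not 0.0 <= percent_check <= 1.0:
        raise ValueError("percent_check must be in [0, 1]")
    if percent_check == 0.0:
        return 0  # e.g. val_percent_check=0 disables validation
    return max(1, int(total_batches * percent_check))

# 1000 training batches checked at 10% -> 100 batches per epoch
print(limit_batches(1000, 0.1))
```

The ``max(1, ...)`` floor matters for tiny datasets: 10% of 3 batches truncates to 0, but a positive fraction should still run something.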
@@ -11,17 +11,14 @@ To train a model using multiple-nodes do the following:
 
 1. Design your LightningModule.
 
-2. Add `torch.DistributedSampler <https://pytorch.org/docs/stable/data.html#torch.utils.data.distributed.DistributedSampler>`_
-   which enables access to a subset of your full dataset to each GPU.
-
-3. Enable ddp in the trainer
+2. Enable ddp in the trainer
 
 .. code-block:: python
 
     # train on 32 GPUs across 4 nodes
     trainer = Trainer(gpus=8, num_nodes=4, distributed_backend='ddp')
 
-4. It's a good idea to structure your train.py file like this:
+3. It's a good idea to structure your train.py file like this:
 
 .. code-block:: python
 
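The 32-GPU figure in the hunk above is simply ``gpus * num_nodes``: ddp launches one process per GPU and gives each a unique global rank. A sketch of that bookkeeping (``ddp_layout`` and its return shape are illustrative, not Lightning's API):

```python
def ddp_layout(gpus_per_node: int, num_nodes: int):
    """Illustrative sketch (not the Lightning API) of the process layout
    ddp sets up: one process per GPU, each identified by a global rank
    derived from its node index and local GPU index."""
    world_size = gpus_per_node * num_nodes
    ranks = {
        (node, local): node * gpus_per_node + local
        for node in range(num_nodes)
        for local in range(gpus_per_node)
    }
    return world_size, ranks

# gpus=8, num_nodes=4 -> 32 processes; node 1, local GPU 0 is global rank 8
world_size, ranks = ddp_layout(8, 4)
print(world_size, ranks[(1, 0)])
```

Every global rank in ``range(world_size)`` appears exactly once, which is what lets a distributed sampler hand each process a disjoint shard of the dataset.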
@@ -91,6 +88,8 @@ To train a model using multiple-nodes do the following:
 
     sbatch submit.sh
 
+.. note:: using :class:`~torch.utils.data.distributed.DistributedSampler` is already handled by Lightning.
+
 Walltime auto-resubmit
 -----------------------------------
 When you use Lightning in a SLURM cluster, lightning automatically detects when it is about
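The ``sbatch submit.sh`` line in the hunk above assumes a SLURM batch script; a minimal sketch of what such a script might look like for the 8-GPU-per-node, 4-node run (resource values and the ``train.py`` name are illustrative, not taken from the original docs):

```shell
#!/bin/bash
#SBATCH --nodes=4              # must match num_nodes in the Trainer
#SBATCH --ntasks-per-node=8    # one task per GPU
#SBATCH --gres=gpu:8           # must match gpus in the Trainer
#SBATCH --time=02:00:00        # walltime limit for the job
#SBATCH --signal=SIGUSR1@90    # signal 90s before walltime so the job can checkpoint

# activate your environment here, then launch one training process per task
srun python train.py
```

The ``--signal`` line is what walltime-based auto-resubmission hinges on: the scheduler warns the job shortly before killing it, leaving time to save a checkpoint.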