.. testsetup:: *

    from pytorch_lightning.trainer.trainer import Trainer

.. _fast-training:

Fast Training
=============
Lightning offers several ``Trainer`` options that shorten training by validating less often,
bounding the number of epochs, or using only a subset of the data. These are useful both for
faster iteration and for debugging.

----------------

Check validation every n epochs
-------------------------------
If you have a small dataset, epochs are short, so you might want to run validation only every n epochs.

.. testcode::

    # DEFAULT
    trainer = Trainer(check_val_every_n_epoch=1)
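
To make the effect of this flag concrete, here is a minimal pure-Python sketch of which epochs would end with a validation run. The helper ``validation_epochs`` is hypothetical and not part of Lightning; it assumes validation runs at the end of every n-th epoch, counting epochs from 1.

.. code-block:: python

    def validation_epochs(max_epochs, check_val_every_n_epoch=1):
        """Return the 1-based epochs after which validation would run.

        Assumption: validation runs at the end of every n-th epoch.
        This is an illustration, not Lightning's implementation.
        """
        if check_val_every_n_epoch < 1:
            raise ValueError("check_val_every_n_epoch must be >= 1")
        return [
            epoch
            for epoch in range(1, max_epochs + 1)
            if epoch % check_val_every_n_epoch == 0
        ]


    # Hypothetical run: 10 epochs, validating every 3rd epoch.
    epochs_with_validation = validation_epochs(max_epochs=10, check_val_every_n_epoch=3)

Under that assumption, a 10-epoch run with ``Trainer(check_val_every_n_epoch=3)`` would validate three times instead of ten.
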

----------------

Force training for min or max epochs
------------------------------------
It can be useful to force training to run for at least a minimum number of epochs, or to cap it at a maximum number.

.. seealso::
    :class:`~pytorch_lightning.trainer.trainer.Trainer`

.. testcode::

    # DEFAULT
    trainer = Trainer(min_epochs=1, max_epochs=1000)
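
The two bounds matter most when something else, such as early stopping, can end training. The sketch below is a simplified model of that interaction, not Lightning's implementation: the function ``should_stop`` and its arguments are hypothetical, and it assumes a stop request is ignored until ``min_epochs`` epochs have completed, while ``max_epochs`` always ends training.

.. code-block:: python

    def should_stop(epochs_completed, early_stop_requested, min_epochs=1, max_epochs=1000):
        """Decide whether training ends after ``epochs_completed`` epochs.

        Assumptions (illustrative only):
        - reaching ``max_epochs`` always ends training;
        - an early-stop request is ignored before ``min_epochs`` epochs are done.
        """
        if epochs_completed >= max_epochs:
            return True
        if epochs_completed < min_epochs:
            return False
        return early_stop_requested


    # Hypothetical case: early stopping fires after 2 epochs, but min_epochs=5.
    stops_after_two = should_stop(2, early_stop_requested=True, min_epochs=5)
    stops_after_five = should_stop(5, early_stop_requested=True, min_epochs=5)

In this model, training continues past epoch 2 and can stop once the fifth epoch has completed.
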

----------------

Set validation check frequency within 1 training epoch
------------------------------------------------------
For large datasets it's often desirable to check validation multiple times within a single training epoch.
Pass a float between 0.0 and 1.0 to check after that fraction of a training epoch. Pass an int k to check every k training batches.
You must use an int with an ``IterableDataset``, since its length may not be known in advance.

.. testcode::

    # DEFAULT
    trainer = Trainer(val_check_interval=1.0)

    # check validation every 0.25 of an epoch (4 times per epoch)
    trainer = Trainer(val_check_interval=0.25)

    # check every 100 train batches (e.g. for IterableDatasets or a fixed frequency)
    trainer = Trainer(val_check_interval=100)
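
The float and int forms both reduce to "validate every k training batches". The following sketch shows one plausible conversion; ``val_check_batch`` is a hypothetical helper, and the rounding (truncate, but never below one batch) is an assumption rather than a statement of Lightning's exact behavior.

.. code-block:: python

    def val_check_batch(val_check_interval, num_training_batches):
        """Convert ``val_check_interval`` into a number of training batches.

        Assumptions (illustrative only):
        - a float is a fraction of the epoch, truncated to whole batches,
          with a floor of one batch;
        - an int is already a batch count and is returned unchanged.
        """
        if isinstance(val_check_interval, float):
            if not 0.0 < val_check_interval <= 1.0:
                raise ValueError("a float interval must be in (0.0, 1.0]")
            return max(1, int(num_training_batches * val_check_interval))
        if val_check_interval < 1:
            raise ValueError("an int interval must be >= 1")
        return val_check_interval


    # Hypothetical epoch of 1000 training batches.
    every_quarter = val_check_batch(0.25, 1000)  # fraction of an epoch
    every_hundred = val_check_batch(100, 1000)   # fixed number of batches

The float form needs ``num_training_batches``, which is why a dataset of unknown length has to use the int form.
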

----------------

Use data subset for training, validation and test
-------------------------------------------------
If you don't want to use 100% of the training, validation, or test set (for debugging, or because the dataset is huge), set these flags.

.. testcode::

    # DEFAULT
    trainer = Trainer(
        limit_train_batches=1.0,
        limit_val_batches=1.0,
        limit_test_batches=1.0
    )

    # use only 10%, 20% and 30% of the training, validation and test sets, respectively
    trainer = Trainer(
        limit_train_batches=0.1,
        limit_val_batches=0.2,
        limit_test_batches=0.3
    )
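
As a rough illustration of what a fractional limit means in batches, the sketch below scales a dataloader's batch count by the limit. ``limited_batches`` is a hypothetical helper, and truncating to a whole number of batches is an assumption, not a description of Lightning's internals.

.. code-block:: python

    def limited_batches(total_batches, limit):
        """Return how many batches a fractional limit would keep.

        Assumption (illustrative only): the fraction is applied to the
        total batch count and truncated to a whole number of batches.
        """
        if not 0.0 <= limit <= 1.0:
            raise ValueError("limit must be between 0.0 and 1.0")
        return int(total_batches * limit)


    # Hypothetical dataloader with 200 batches.
    full_epoch = limited_batches(200, 1.0)      # the default: every batch
    quarter_epoch = limited_batches(200, 0.25)  # a quarter of the batches
    no_batches = limited_batches(200, 0.0)      # nothing left to run

A limit of ``0`` leaves no batches at all, which matches the idea that the corresponding loop does not run.
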

If you also pass ``shuffle=True`` to the dataloader, a different random subset of your dataset will be used for each epoch; otherwise the same subset will be used for all epochs.
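
The interaction between shuffling and a data limit can be sketched without Lightning. The example below works on sample indices rather than batches, which is a simplification, and ``epoch_subset`` is a hypothetical helper: it keeps the first ``num_samples`` indices of each epoch, reshuffling first when ``shuffle`` is true.

.. code-block:: python

    import random


    def epoch_subset(dataset_size, num_samples, shuffle, rng):
        """Return the indices seen in one epoch when only a prefix is used.

        Illustrative only: a real dataloader limits batches, not samples.
        """
        indices = list(range(dataset_size))
        if shuffle:
            rng.shuffle(indices)  # new order each epoch
        return indices[:num_samples]


    rng = random.Random(0)  # seeded so the sketch is reproducible

    # Without shuffling, every epoch sees the same first 10 samples.
    fixed = [epoch_subset(100, 10, False, rng) for _ in range(3)]

    # With shuffling, each epoch draws a fresh order, so the prefix changes.
    shuffled = [epoch_subset(100, 10, True, rng) for _ in range(3)]

Comparing ``fixed`` and ``shuffled`` epoch by epoch shows the same prefix repeated in the first case and a different one per epoch in the second.
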

.. note:: ``limit_train_batches``, ``limit_val_batches`` and ``limit_test_batches`` will be overwritten by ``overfit_batches`` if ``overfit_batches`` > 0. ``limit_val_batches`` will be ignored if ``fast_dev_run=True``.

.. note:: If you set ``limit_val_batches=0``, validation will be disabled.