lightning/docs/Trainer/Logging.md

Lightning offers a few options for logging information about the model, GPU usage, etc. (via test-tube). It also offers printing options for monitoring training.
---
#### Display metrics in progress bar
``` {.python}
# DEFAULT
trainer = Trainer(progress_bar=True)
```
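Setting `progress_bar=False` disables the bar entirely. As an illustration only (this is not Lightning's actual implementation), a flag like this typically just gates whether the batch iterator gets wrapped in a bar:

``` {.python}
# A minimal sketch (NOT Lightning's real code) of how a progress_bar
# flag can gate bar rendering around a batch loop.
def iterate_batches(batches, progress_bar=True):
    if progress_bar:
        from tqdm import tqdm  # assumes tqdm is installed
        return tqdm(batches)
    return iter(batches)

# with the bar disabled, batches pass through unchanged
list(iterate_batches([1, 2, 3], progress_bar=False))
```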
---
#### Log metric row every k batches
Every k batches, Lightning adds a row to the metrics log.
``` {.python}
# DEFAULT (ie: add a metrics log row every 10 batches)
trainer = Trainer(add_log_row_interval=10)
```
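The interval is counted in batches, not epochs. A rough sketch of the semantics (the modulo check is an assumption made for illustration, not Lightning's actual code):

``` {.python}
# Sketch: with add_log_row_interval=10, a metrics row is recorded on
# every batch index divisible by 10 (assumed modulo check, illustrative only).
add_log_row_interval = 10
logged_batches = [b for b in range(35) if b % add_log_row_interval == 0]
# batches 0, 10, 20, 30 would each get a metrics row
```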
---
#### Process position
When running multiple models on the same machine, each one needs its own progress bar.
Lightning stacks progress bars according to this value.
``` {.python}
# DEFAULT
trainer = Trainer(process_position=0)
# if this is the second model on the node, show the second progress bar below
trainer = Trainer(process_position=1)
```
---
#### Save a snapshot of all hyperparameters
Whenever you call .save() on the test-tube Experiment, it logs all hyperparameters currently in use.
Give Lightning a test-tube Experiment object to automate this for you.
``` {.python}
from test_tube import Experiment
exp = Experiment(...)
Trainer(experiment=exp)
```
---
#### Snapshot code for a training run
Whenever you call .save() on the test-tube Experiment, it snapshots all code and pushes it to a git tag.
Give Lightning a test-tube Experiment object to automate this for you.
``` {.python}
from test_tube import Experiment
exp = Experiment(create_git_tag=True)
Trainer(experiment=exp)
```
---
### Tensorboard support
The Experiment object is a strict subclass of the PyTorch SummaryWriter. However, this class
also snapshots every detail about the experiment (data folder paths, code, hyperparameters)
and lets you visualize it using TensorBoard.
``` {.python}
from test_tube import Experiment, HyperOptArgumentParser
# exp hyperparams
args = HyperOptArgumentParser()
hparams = args.parse_args()
# this is a SummaryWriter with a nicer logging structure
exp = Experiment(save_dir='/some/path', create_git_tag=True)
# track experiment details (must be ArgumentParser or HyperOptArgumentParser).
# each option in the parser is tracked
exp.argparse(hparams)
exp.tag({'description': 'running demo'})
# trainer uses the exp object to log exp data
trainer = Trainer(experiment=exp)
trainer.fit(model)
# view logs at:
# tensorboard --logdir /some/path
```
---
#### Write log file to csv every k batches
Every k batches, Lightning writes the accumulated log rows to disk.
``` {.python}
# DEFAULT (ie: save a .csv log file every 100 batches)
trainer = Trainer(log_save_interval=100)
```
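Because the log is a plain .csv file, it can be inspected with the standard library once a run finishes. A minimal sketch, assuming a CSV with a header row (the column names and contents below are hypothetical, for illustration only):

``` {.python}
import csv
import io

# hypothetical contents of a metrics .csv log (header row + metric rows)
raw = "epoch,batch,loss\n0,0,1.20\n0,10,0.95\n"

# csv.DictReader maps each row to a dict keyed by the header
rows = list(csv.DictReader(io.StringIO(raw)))
losses = [float(r["loss"]) for r in rows]
```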