Lightning offers options for logging information about the model, GPU usage, etc., via several different logging frameworks. It also offers printing options for monitoring training.
---
### default_save_path
Lightning sets up a default TestTubeLogger and checkpoint callback for you, both of which save to
```os.getcwd()``` by default. To modify the save path you can set:
```python
Trainer(default_save_path='/your/path/to/save/checkpoints')
```
If you need more custom behavior (different paths for each, different metrics, etc.)
from the logger and the checkpoint callback, pass in your own instances as explained below.
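For example, here is a rough sketch of passing both instances explicitly (this assumes the `ModelCheckpoint` callback from `pytorch_lightning.callbacks`; check the checkpointing docs for its exact arguments):
```python
from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import ModelCheckpoint
from pytorch_lightning.logging import TestTubeLogger

# log and checkpoint to two different locations
logger = TestTubeLogger(save_dir='/your/path/to/logs', name='my_exp')
checkpoint = ModelCheckpoint(filepath='/your/path/to/checkpoints', monitor='val_loss')

trainer = Trainer(logger=logger, checkpoint_callback=checkpoint)
```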
---
### Setting up logging
The trainer initializes a default logger for you (TestTubeLogger). All logs will
go to the current working directory under ```os.getcwd()/lightning_logs```.
If you want to modify the default logging behavior even more, pass in a logger
(which should inherit from `LightningLoggerBase`).
```{.python}
my_logger = MyLightningLogger(...)
trainer = Trainer(logger=my_logger)
```
The path in this logger will override `default_save_path`.
Lightning supports several common experiment tracking frameworks out of the box.
---
#### Test tube
Log using [test tube](https://williamfalcon.github.io/test-tube/).
```{.python}
from pytorch_lightning.logging import TestTubeLogger
tt_logger = TestTubeLogger(
    save_dir=".",
    name="default",
    debug=False,
    create_git_tag=False
)
trainer = Trainer(logger=tt_logger)
```
---
#### MLFlow
Log using [mlflow](https://mlflow.org).
```{.python}
from pytorch_lightning.logging import MLFlowLogger
mlf_logger = MLFlowLogger(
    experiment_name="default",
    tracking_uri="file:/."
)
trainer = Trainer(logger=mlf_logger)
```
---
#### Custom logger
You can implement your own logger by writing a class that inherits from
`LightningLoggerBase`. Use the `rank_zero_only` decorator to make sure that
only the first process in DDP training logs data.
```{.python}
from pytorch_lightning.logging import LightningLoggerBase, rank_zero_only
class MyLogger(LightningLoggerBase):

    @rank_zero_only
    def log_hyperparams(self, params):
        # params is an argparse.Namespace
        # your code to record hyperparameters goes here
        pass

    @rank_zero_only
    def log_metrics(self, metrics, step_num):
        # metrics is a dictionary of metric names and values
        # your code to record metrics goes here
        pass

    def save(self):
        # Optional. Any code necessary to save logger data goes here
        pass

    @rank_zero_only
    def finalize(self, status):
        # Optional. Any code that needs to be run after training
        # finishes goes here
        pass
```
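Once defined, a custom logger is passed to the trainer the same way as the built-in ones:
```{.python}
my_logger = MyLogger()
trainer = Trainer(logger=my_logger)
```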
If you write a logger that may be useful to others, please send
a pull request to add it to Lightning!
---
#### Using loggers
You can access the logger from anywhere in your LightningModule:
```python
self.logger
# add an image if using TestTubeLogger
self.logger.experiment.add_image(...)
```
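For example, here is a minimal sketch of logging an image from inside `training_step` (this assumes a `training_step(self, batch, batch_idx)` signature, image inputs, and a TestTubeLogger, whose `experiment` exposes the TensorBoard-style `add_image` method):
```python
import torch.nn.functional as F
import pytorch_lightning as pl

class MyModel(pl.LightningModule):
    # __init__, forward, configure_optimizers, dataloaders omitted for brevity

    def training_step(self, batch, batch_idx):
        x, y = batch
        y_hat = self.forward(x)
        loss = F.cross_entropy(y_hat, y)

        # every 100 batches, send the first input image (a CHW tensor)
        # straight to the underlying experiment object
        if batch_idx % 100 == 0:
            self.logger.experiment.add_image('input_images', x[0], batch_idx)

        return {'loss': loss}
```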
#### Display metrics in progress bar
``` {.python}
# DEFAULT
trainer = Trainer(show_progress_bar=True)
```
---
#### Log metric row every k batches
Every k batches, Lightning will make an entry in the metrics log.
``` {.python}
# DEFAULT (ie: add a row to the metrics log every 10 batches)
trainer = Trainer(row_log_interval=10)
```
---
#### Log GPU memory
Logs GPU memory when metrics are logged.
``` {.python}
# DEFAULT
trainer = Trainer(log_gpu_memory=None)
# log only the min/max utilization
trainer = Trainer(log_gpu_memory='min_max')
# log all the GPU memory (if on DDP, logs only that node)
trainer = Trainer(log_gpu_memory='all')
```
---
#### Process position
When running multiple models on the same machine, you can choose where each model's progress bar appears.
Lightning will stack progress bars according to this value.
``` {.python}
# DEFAULT
trainer = Trainer(process_position=0)
# if this is the second model on the node, show the second progress bar below
trainer = Trainer(process_position=1)
```
---
#### Save a snapshot of all hyperparameters
Automatically log the hyperparameters stored in the model's `hparams` attribute (an `argparse.Namespace`).
``` {.python}
class MyModel(pl.LightningModule):

    def __init__(self, hparams):
        super().__init__()
        self.hparams = hparams
        ...

args = parser.parse_args()
model = MyModel(args)

logger = TestTubeLogger(...)
trainer = Trainer(logger=logger)
trainer.fit(model)
```
---
#### Write logs file to csv every k batches
Every k batches, Lightning will write the new logs to disk.
``` {.python}
# DEFAULT (ie: save a .csv log file every 100 batches)
trainer = Trainer(log_save_interval=100)
```