.. testsetup:: *

    import torch
    from torch.nn import Module
    from pytorch_lightning.core.lightning import LightningModule
    from pytorch_lightning.metrics import Metric

.. _metrics:

Metrics
=======

``pytorch_lightning.metrics`` is a Metrics API created for easy metric development and usage in
PyTorch and PyTorch Lightning. It is rigorously tested for all edge cases and includes a growing list of
common metric implementations.

The metrics API exposes ``update()``, ``compute()`` and ``reset()`` methods. The ``Metric`` base class inherits
from ``nn.Module``, so a metric can be called directly as ``metric(...)``. The ``forward()`` method of the base
``Metric`` class serves a dual purpose: it calls ``update()`` on its input and simultaneously returns the value
of the metric over the provided input.
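
The dual role of ``forward()`` can be illustrated with a small, library-free sketch. ``ToyAccuracy`` below is a
hypothetical stand-in written for illustration, not the actual ``Metric`` implementation: calling the object
returns the accuracy of the current batch while also folding that batch into the accumulated state.

.. code-block:: python

    class ToyAccuracy:
        """Hypothetical stand-in that mimics the update/compute/reset contract."""

        def __init__(self):
            self.reset()

        def reset(self):
            # Accumulated state over all batches seen so far.
            self.correct = 0
            self.total = 0

        def update(self, preds, target):
            self.correct += sum(p == t for p, t in zip(preds, target))
            self.total += len(target)

        def compute(self):
            return self.correct / self.total

        def __call__(self, preds, target):
            # Mirror the dual purpose of forward(): accumulate, then
            # return the value for this batch only.
            self.update(preds, target)
            batch_correct = sum(p == t for p, t in zip(preds, target))
            return batch_correct / len(target)


    metric = ToyAccuracy()
    first_batch = metric([1, 0, 1, 1], [1, 0, 0, 1])  # 3 of 4 correct
    second_batch = metric([0, 0], [1, 1])             # 0 of 2 correct
    overall = metric.compute()                        # 3 of 6 correct

The per-batch values differ from ``overall`` because ``compute()`` works on the accumulated counts rather than
on the most recent batch.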

These metrics work with DDP in PyTorch and PyTorch Lightning by default. When ``.compute()`` is called in
distributed mode, the internal state of each metric is synced and reduced across all processes, so that the
logic in ``.compute()`` is applied to state information from every process.
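
Why the states are reduced before ``.compute()`` runs can be seen in a small simulation. The sketch below uses
plain Python dictionaries as hypothetical per-process states (no real process group is involved) and assumes a
sum reduction, as an accuracy-style metric would use.

.. code-block:: python

    # Hypothetical per-process states for an accuracy-style metric.
    process_states = [
        {"correct": 9, "total": 10},  # process 0 saw 10 samples
        {"correct": 1, "total": 2},   # process 1 saw 2 samples
    ]


    def reduce_sum(states, name):
        # Stand-in for an all-reduce with a sum operation.
        return sum(state[name] for state in states)


    def compute(correct, total):
        return correct / total


    # Sync first, then compute once on the combined state.
    synced = compute(
        reduce_sum(process_states, "correct"),
        reduce_sum(process_states, "total"),
    )

    # Computing per process and averaging weights every process equally,
    # regardless of how many samples it saw.
    naive = sum(
        compute(s["correct"], s["total"]) for s in process_states
    ) / len(process_states)

Here ``synced`` is 10/12 while ``naive`` is 0.7, which is why reducing the state rather than the final value
matters when processes see different numbers of samples.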

The example below shows how to use a metric in your ``LightningModule``:

.. code-block:: python

    def __init__(self):
        ...
        self.accuracy = pl.metrics.Accuracy()

    def training_step(self, batch, batch_idx):
        x, y = batch
        logits = self(x)
        ...
        # log step metric
        self.log('train_acc_step', self.accuracy(logits, y))
        ...

    def training_epoch_end(self, outs):
        # log epoch metric
        self.log('train_acc_epoch', self.accuracy.compute())

``Metric`` objects can also be logged directly, in which case Lightning logs
the metric according to the ``on_step`` and ``on_epoch`` flags passed to ``self.log(...)``.
If ``on_epoch`` is ``True``, the logger automatically logs the end-of-epoch metric value by calling
``.compute()``.

.. note::
    The ``sync_dist``, ``sync_dist_op``, ``sync_dist_group``, ``reduce_fx`` and ``tbptt_reduce_fx``
    flags of ``self.log(...)`` have no effect on metric logging, because the metric class
    contains its own distributed synchronization logic.

    This only holds for metrics that inherit from the base ``Metric`` class;
    the functional metric API provides no built-in distributed synchronization
    or reduction functions.

.. code-block:: python

    def __init__(self):
        ...
        self.train_acc = pl.metrics.Accuracy()
        self.valid_acc = pl.metrics.Accuracy()

    def training_step(self, batch, batch_idx):
        x, y = batch
        logits = self(x)
        ...
        self.train_acc(logits, y)
        self.log('train_acc', self.train_acc, on_step=True, on_epoch=False)

    def validation_step(self, batch, batch_idx):
        x, y = batch
        logits = self(x)
        ...
        self.valid_acc(logits, y)
        self.log('valid_acc', self.valid_acc, on_step=True, on_epoch=True)

This metrics API is independent of PyTorch Lightning. Metrics can be used directly in PyTorch, as shown in the example:

.. code-block:: python

    from pytorch_lightning import metrics

    train_accuracy = metrics.Accuracy()
    valid_accuracy = metrics.Accuracy(compute_on_step=False)

    for epoch in range(epochs):
        for x, y in train_data:
            y_hat = model(x)

            # training step accuracy
            batch_acc = train_accuracy(y_hat, y)

        for x, y in valid_data:
            y_hat = model(x)
            valid_accuracy(y_hat, y)

    # total accuracy over all training batches
    total_train_accuracy = train_accuracy.compute()

    # total accuracy over all validation batches
    total_valid_accuracy = valid_accuracy.compute()

Implementing a Metric
---------------------

To implement your custom metric, subclass the base ``Metric`` class and implement the following methods:

- ``__init__()``: Each state variable should be registered using ``self.add_state(...)``.
- ``update()``: Any code needed to update the state given any inputs to the metric.
- ``compute()``: Computes a final value from the state of the metric.

Calling ``add_state()`` correctly is all you need to do to make a custom metric work with DDP.
``reset()`` is called on metric state variables added using ``add_state()``.

To see how metric states are synchronized across distributed processes, refer to the ``add_state()`` docs
of the base ``Metric`` class.
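
The bookkeeping behind ``add_state()`` and ``reset()`` can be pictured roughly as follows. ``ToyMetric`` is a
hypothetical, simplified class written for illustration, not the library code; the real base class additionally
handles tensors and distributed reduction.

.. code-block:: python

    import copy


    class ToyMetric:
        """Hypothetical sketch of state registration; not the library code."""

        def __init__(self):
            self._defaults = {}

        def add_state(self, name, default):
            # Remember the default and expose the state as an attribute.
            self._defaults[name] = copy.deepcopy(default)
            setattr(self, name, copy.deepcopy(default))

        def reset(self):
            # Restore every registered state to a fresh copy of its default.
            for name, default in self._defaults.items():
                setattr(self, name, copy.deepcopy(default))


    class ToyMean(ToyMetric):
        def __init__(self):
            super().__init__()
            self.add_state("values", default=[])

        def update(self, value):
            self.values.append(value)

        def compute(self):
            return sum(self.values) / len(self.values)


    mean = ToyMean()
    mean.update(2.0)
    mean.update(4.0)
    result = mean.compute()
    mean.reset()  # "values" is an empty list again

Only attributes registered through ``add_state()`` are restored by ``reset()`` in this sketch, which is why every
piece of accumulated state should be registered that way.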

Example implementation:

.. code-block:: python

    import torch
    from pytorch_lightning.metrics import Metric

    class MyAccuracy(Metric):
        def __init__(self, ddp_sync_on_step=False):
            super().__init__(ddp_sync_on_step=ddp_sync_on_step)

            self.add_state("correct", default=torch.tensor(0), dist_reduce_fx="sum")
            self.add_state("total", default=torch.tensor(0), dist_reduce_fx="sum")

        def update(self, preds: torch.Tensor, target: torch.Tensor):
            preds, target = self._input_format(preds, target)
            assert preds.shape == target.shape

            self.correct += torch.sum(preds == target)
            self.total += target.numel()

        def compute(self):
            return self.correct.float() / self.total

Metric
^^^^^^

.. autoclass:: pytorch_lightning.metrics.Metric
    :noindex:

Classification Metrics
----------------------

Accuracy
^^^^^^^^

.. autoclass:: pytorch_lightning.metrics.classification.Accuracy
    :noindex:

Regression Metrics
------------------

MeanSquaredError
^^^^^^^^^^^^^^^^

.. autoclass:: pytorch_lightning.metrics.regression.MeanSquaredError
    :noindex:

MeanAbsoluteError
^^^^^^^^^^^^^^^^^

.. autoclass:: pytorch_lightning.metrics.regression.MeanAbsoluteError
    :noindex:

MeanSquaredLogError
^^^^^^^^^^^^^^^^^^^

.. autoclass:: pytorch_lightning.metrics.regression.MeanSquaredLogError
    :noindex:

Functional Metrics
==================

The functional metrics follow a simple paradigm: input in, output out. They provide no advanced mechanisms for syncing across DDP nodes or for aggregating over batches; they simply compute the metric value from the given inputs.

Their integration with other parts of PyTorch Lightning will also never be as tight as that of the class-based interface.
If you only need to compute metric values, the functional metrics are the way to go. If you are looking for the best integration and user experience, consider using the class-based interface instead.
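
The difference from the class-based interface can be shown with a library-free sketch. ``toy_accuracy`` is a
hypothetical pure function written for illustration, not one of the library's functional metrics: each call depends only on
its arguments, so any aggregation across batches is left to the caller.

.. code-block:: python

    def toy_accuracy(preds, target):
        """Hypothetical stateless metric: input in, output out."""
        correct = sum(p == t for p, t in zip(preds, target))
        return correct / len(target)


    batches = [
        ([1, 0, 1, 1], [1, 0, 0, 1]),  # 3 of 4 correct
        ([0, 0], [1, 1]),              # 0 of 2 correct
    ]

    # Each call is independent; nothing is remembered between calls.
    per_batch = [toy_accuracy(p, t) for p, t in batches]

    # Aggregation over batches must be done by hand, here by
    # concatenating all predictions and targets first.
    all_preds = [p for preds, _ in batches for p in preds]
    all_target = [t for _, target in batches for t in target]
    overall = toy_accuracy(all_preds, all_target)
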

Classification
--------------

accuracy [func]
^^^^^^^^^^^^^^^

.. autofunction:: pytorch_lightning.metrics.functional.classification.accuracy
    :noindex:

auc [func]
^^^^^^^^^^

.. autofunction:: pytorch_lightning.metrics.functional.classification.auc
    :noindex:

auroc [func]
^^^^^^^^^^^^

.. autofunction:: pytorch_lightning.metrics.functional.classification.auroc
    :noindex:

average_precision [func]
^^^^^^^^^^^^^^^^^^^^^^^^

.. autofunction:: pytorch_lightning.metrics.functional.classification.average_precision
    :noindex:

confusion_matrix [func]
^^^^^^^^^^^^^^^^^^^^^^^

.. autofunction:: pytorch_lightning.metrics.functional.classification.confusion_matrix
    :noindex:

dice_score [func]
^^^^^^^^^^^^^^^^^

.. autofunction:: pytorch_lightning.metrics.functional.classification.dice_score
    :noindex:

f1_score [func]
^^^^^^^^^^^^^^^

.. autofunction:: pytorch_lightning.metrics.functional.classification.f1_score
    :noindex:

fbeta_score [func]
^^^^^^^^^^^^^^^^^^

.. autofunction:: pytorch_lightning.metrics.functional.classification.fbeta_score
    :noindex:

iou [func]
^^^^^^^^^^

.. autofunction:: pytorch_lightning.metrics.functional.classification.iou
    :noindex:

multiclass_roc [func]
^^^^^^^^^^^^^^^^^^^^^

.. autofunction:: pytorch_lightning.metrics.functional.classification.multiclass_roc
    :noindex:

precision [func]
^^^^^^^^^^^^^^^^

.. autofunction:: pytorch_lightning.metrics.functional.classification.precision
    :noindex:

precision_recall [func]
^^^^^^^^^^^^^^^^^^^^^^^

.. autofunction:: pytorch_lightning.metrics.functional.classification.precision_recall
    :noindex:

precision_recall_curve [func]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. autofunction:: pytorch_lightning.metrics.functional.classification.precision_recall_curve
    :noindex:

recall [func]
^^^^^^^^^^^^^

.. autofunction:: pytorch_lightning.metrics.functional.classification.recall
    :noindex:

roc [func]
^^^^^^^^^^

.. autofunction:: pytorch_lightning.metrics.functional.classification.roc
    :noindex:

stat_scores [func]
^^^^^^^^^^^^^^^^^^

.. autofunction:: pytorch_lightning.metrics.functional.classification.stat_scores
    :noindex:

stat_scores_multiple_classes [func]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. autofunction:: pytorch_lightning.metrics.functional.classification.stat_scores_multiple_classes
    :noindex:

to_categorical [func]
^^^^^^^^^^^^^^^^^^^^^

.. autofunction:: pytorch_lightning.metrics.functional.classification.to_categorical
    :noindex:

to_onehot [func]
^^^^^^^^^^^^^^^^

.. autofunction:: pytorch_lightning.metrics.functional.classification.to_onehot
    :noindex:

Regression
----------

mae [func]
^^^^^^^^^^

.. autofunction:: pytorch_lightning.metrics.functional.regression.mae
    :noindex:

mse [func]
^^^^^^^^^^

.. autofunction:: pytorch_lightning.metrics.functional.regression.mse
    :noindex:

psnr [func]
^^^^^^^^^^^

.. autofunction:: pytorch_lightning.metrics.functional.regression.psnr
    :noindex:

rmse [func]
^^^^^^^^^^^

.. autofunction:: pytorch_lightning.metrics.functional.regression.rmse
    :noindex:

rmsle [func]
^^^^^^^^^^^^

.. autofunction:: pytorch_lightning.metrics.functional.regression.rmsle
    :noindex:

ssim [func]
^^^^^^^^^^^

.. autofunction:: pytorch_lightning.metrics.functional.regression.ssim
    :noindex:

NLP
---

bleu_score [func]
^^^^^^^^^^^^^^^^^

.. autofunction:: pytorch_lightning.metrics.functional.nlp.bleu_score
    :noindex: