# lightning/tests/metrics/utils.py

import os
import pickle
import sys
from functools import partial
from typing import Callable

import numpy as np
import pytest
import torch
from torch.multiprocessing import Pool, set_start_method

from pytorch_lightning.metrics import Metric
try:
    set_start_method("spawn")
except RuntimeError:
    pass

NUM_PROCESSES = 2
NUM_BATCHES = 10
BATCH_SIZE = 32
NUM_CLASSES = 5
EXTRA_DIM = 3
THRESHOLD = 0.5
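
# Note (illustrative sketch, not part of the original module): tests built on
# these constants typically pre-generate batched inputs whose first dimension
# indexes batches, e.g.
#
#   preds = torch.rand(NUM_BATCHES, BATCH_SIZE, NUM_CLASSES)
#   target = torch.randint(NUM_CLASSES, (NUM_BATCHES, BATCH_SIZE))
#
# so that preds[i] / target[i] select one batch, as _class_test and
# _functional_test below assume.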


def setup_ddp(rank, world_size):
    """ Setup ddp environment """
os.environ["MASTER_ADDR"] = "localhost"
os.environ["MASTER_PORT"] = "8088"
    if torch.distributed.is_available() and sys.platform not in ("win32", "cygwin"):
        torch.distributed.init_process_group("gloo", rank=rank, world_size=world_size)


def _assert_allclose(pl_result, sk_result, atol: float = 1e-8):
    """ Utility function for recursively asserting that two results are within
    a certain tolerance
    """
    # single output compare
    if isinstance(pl_result, torch.Tensor):
        assert np.allclose(pl_result.numpy(), sk_result, atol=atol, equal_nan=True)
    # multi output compare
    elif isinstance(pl_result, (tuple, list)):
        for pl_res, sk_res in zip(pl_result, sk_result):
            _assert_allclose(pl_res, sk_res, atol=atol)
    else:
        raise ValueError('Unknown format for comparison')


def _assert_tensor(pl_result):
    """ Utility function for recursively checking that some input only consists of
    torch tensors
    """
    if isinstance(pl_result, (list, tuple)):
        for plr in pl_result:
            _assert_tensor(plr)
    else:
        assert isinstance(pl_result, torch.Tensor)


def _class_test(
    rank: int,
    worldsize: int,
    preds: torch.Tensor,
    target: torch.Tensor,
    metric_class: Metric,
    sk_metric: Callable,
    dist_sync_on_step: bool,
    metric_args: dict = {},
    check_dist_sync_on_step: bool = True,
    check_batch: bool = True,
    atol: float = 1e-8,
):
"""Utility function doing the actual comparison between lightning class metric
and reference metric.
Args:
rank: rank of current process
worldsize: number of processes
preds: torch tensor with predictions
target: torch tensor with targets
metric_class: lightning metric class that should be tested
sk_metric: callable function that is used for comparison
dist_sync_on_step: bool, if true will synchronize metric state across
processes at each ``forward()``
metric_args: dict with additional arguments used for class initialization
check_dist_sync_on_step: bool, if true will check if the metric is also correctly
calculated per batch per device (and not just at the end)
check_batch: bool, if true will check if the metric is also correctly
calculated across devices for each batch (and not just at the end)
"""
    # Instantiate lightning metric
    metric = metric_class(compute_on_step=True, dist_sync_on_step=dist_sync_on_step, **metric_args)

    # verify metrics work after being loaded from pickled state
    pickled_metric = pickle.dumps(metric)
    metric = pickle.loads(pickled_metric)

    for i in range(rank, NUM_BATCHES, worldsize):
        batch_result = metric(preds[i], target[i])

        if metric.dist_sync_on_step:
            if rank == 0:
                ddp_preds = torch.cat([preds[i + r] for r in range(worldsize)])
                ddp_target = torch.cat([target[i + r] for r in range(worldsize)])
                sk_batch_result = sk_metric(ddp_preds, ddp_target)
                # assert for dist_sync_on_step
                if check_dist_sync_on_step:
                    _assert_allclose(batch_result, sk_batch_result, atol=atol)
        else:
            sk_batch_result = sk_metric(preds[i], target[i])
            # assert for batch
            if check_batch:
                _assert_allclose(batch_result, sk_batch_result, atol=atol)

    # check on all batches on all ranks
    result = metric.compute()
    _assert_tensor(result)

    total_preds = torch.cat([preds[i] for i in range(NUM_BATCHES)])
    total_target = torch.cat([target[i] for i in range(NUM_BATCHES)])
    sk_result = sk_metric(total_preds, total_target)

    # assert after aggregation
    _assert_allclose(result, sk_result, atol=atol)


def _functional_test(
    preds: torch.Tensor,
    target: torch.Tensor,
    metric_functional: Callable,
    sk_metric: Callable,
    metric_args: dict = {},
    atol: float = 1e-8,
):
"""Utility function doing the actual comparison between lightning functional metric
and reference metric.
Args:
preds: torch tensor with predictions
target: torch tensor with targets
metric_functional: lightning metric functional that should be tested
sk_metric: callable function that is used for comparison
metric_args: dict with additional arguments used for class initialization
"""
    metric = partial(metric_functional, **metric_args)

    for i in range(NUM_BATCHES):
        lightning_result = metric(preds[i], target[i])
        sk_result = sk_metric(preds[i], target[i])

        # assert it's the same
        _assert_allclose(lightning_result, sk_result, atol=atol)


class MetricTester:
"""Class used for efficiently run alot of parametrized tests in ddp mode.
Makes sure that ddp is only setup once and that pool of processes are
used for all tests.
    All tests should subclass from this and implement a new method called
    `test_metric_name` where the method `self.run_class_metric_test` or
    `self.run_functional_metric_test` is called inside.
    """

    atol = 1e-8

    def setup_class(self):
"""Setup the metric class. This will spawn the pool of workers that are
used for metric testing and setup_ddp
"""
        self.poolSize = NUM_PROCESSES
        self.pool = Pool(processes=self.poolSize)
        self.pool.starmap(setup_ddp, [(rank, self.poolSize) for rank in range(self.poolSize)])

    def teardown_class(self):
        """ Close pool of workers """
        self.pool.close()
        self.pool.join()

    def run_functional_metric_test(
        self,
        preds: torch.Tensor,
        target: torch.Tensor,
        metric_functional: Callable,
        sk_metric: Callable,
        metric_args: dict = {},
    ):
"""Main method that should be used for testing functions. Call this inside
testing method
Args:
preds: torch tensor with predictions
target: torch tensor with targets
metric_functional: lightning metric class that should be tested
sk_metric: callable function that is used for comparison
metric_args: dict with additional arguments used for class initialization
"""
        _functional_test(
            preds=preds,
            target=target,
            metric_functional=metric_functional,
            sk_metric=sk_metric,
            metric_args=metric_args,
            atol=self.atol,
        )
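
    # Illustrative usage (a hedged sketch; `my_metric` and `_sk_my_metric` are
    # hypothetical names): inside a MetricTester subclass one would call
    #
    #   self.run_functional_metric_test(
    #       preds=torch.rand(NUM_BATCHES, BATCH_SIZE, NUM_CLASSES),
    #       target=torch.randint(NUM_CLASSES, (NUM_BATCHES, BATCH_SIZE)),
    #       metric_functional=my_metric,
    #       sk_metric=_sk_my_metric,
    #   )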

    def run_class_metric_test(
        self,
        ddp: bool,
        preds: torch.Tensor,
        target: torch.Tensor,
        metric_class: Metric,
        sk_metric: Callable,
        dist_sync_on_step: bool,
        metric_args: dict = {},
        check_dist_sync_on_step: bool = True,
        check_batch: bool = True,
    ):
"""Main method that should be used for testing class. Call this inside testing
methods.
Args:
ddp: bool, if running in ddp mode or not
preds: torch tensor with predictions
target: torch tensor with targets
metric_class: lightning metric class that should be tested
sk_metric: callable function that is used for comparison
dist_sync_on_step: bool, if true will synchronize metric state across
processes at each ``forward()``
metric_args: dict with additional arguments used for class initialization
check_dist_sync_on_step: bool, if true will check if the metric is also correctly
calculated per batch per device (and not just at the end)
check_batch: bool, if true will check if the metric is also correctly
calculated across devices for each batch (and not just at the end)
"""
        if ddp:
            if sys.platform == "win32":
                pytest.skip("DDP not supported on windows")

            self.pool.starmap(
                partial(
                    _class_test,
                    preds=preds,
                    target=target,
                    metric_class=metric_class,
                    sk_metric=sk_metric,
                    dist_sync_on_step=dist_sync_on_step,
                    metric_args=metric_args,
                    check_dist_sync_on_step=check_dist_sync_on_step,
                    check_batch=check_batch,
                    atol=self.atol,
                ),
                [(rank, self.poolSize) for rank in range(self.poolSize)],
            )
        else:
            _class_test(
                0,
                1,
                preds=preds,
                target=target,
                metric_class=metric_class,
                sk_metric=sk_metric,
                dist_sync_on_step=dist_sync_on_step,
                metric_args=metric_args,
                check_dist_sync_on_step=check_dist_sync_on_step,
                check_batch=check_batch,
                atol=self.atol,
            )
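

# Hypothetical usage sketch (not part of the original file): a metric test module
# would typically subclass MetricTester like this, where `Accuracy`, `_sk_accuracy`
# and the parametrization are illustrative assumptions rather than fixed API.
#
# class TestAccuracy(MetricTester):
#
#     @pytest.mark.parametrize("ddp", [True, False])
#     @pytest.mark.parametrize("dist_sync_on_step", [True, False])
#     def test_accuracy(self, ddp, dist_sync_on_step):
#         preds = torch.rand(NUM_BATCHES, BATCH_SIZE, NUM_CLASSES)
#         target = torch.randint(NUM_CLASSES, (NUM_BATCHES, BATCH_SIZE))
#         self.run_class_metric_test(
#             ddp=ddp,
#             preds=preds,
#             target=target,
#             metric_class=Accuracy,
#             sk_metric=_sk_accuracy,
#             dist_sync_on_step=dist_sync_on_step,
#         )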