lightning/pytorch_lightning/trainer
Alexey Karnachev ddbf7de6dc
Added accumulation of loggers' metrics for the same steps (#1278)
* `add_argparse_args` method fixed (argument types added)

* autopep8 fixes

* `--gpus=0` removed from test (for CI)

* Update pytorch_lightning/trainer/trainer.py

Co-Authored-By: Joe Davison <joe@huggingface.co>

* test_with_accumulate_grad_batches added

* agg_and_log_metrics logic added to the base logger class (see the aggregation sketch after this log)

* small format fix

* agg metrics strategies removed (to avoid over-complicating things)

* agg metrics: handle zero step

* autopep8

* changelog updated

* flake fix

* metrics aggregators factored out, metrics_agg.py added + tests

* metrics agg default value added

* Update pytorch_lightning/loggers/metrics_agg.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* remove .item which causes sync issues (#1254) (see the sync illustration after this log)

* fixed gradient acc sched

* test_metrics_agg.py removed (all tested in docstrings), agg metrics refactored

* autopep8

* loggers base.py types fixed

* test

* metrics aggregation for loggers: each key now has a specific function (or a default one); see the aggregation sketch after this log

* docstrings updated

* manual typehints removed from docstrings

* batch_size decreased for test `test_with_accumulate_grad_batches`

* extend running accum

* refactor

* fix tests

* allowed_types generator scoped

* trainer.py: removed duplicate distutils import

* TensorRunningAccum refactored (see the running-window sketch after this log)

* TensorRunningAccum added to changelog (Changed)

* changelog pull request link added

Co-authored-by: Joe Davison <joe@huggingface.co>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-08 08:35:47 -04:00
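
A rough sketch of the per-step aggregation idea from the log above: metrics reported for the same step are buffered and merged, each key by its own function, with anything else falling back to a default such as the mean; the merged dict is flushed when the step advances. The names here (AggregatingLogger, agg_key_funcs, agg_default_func) are illustrative stand-ins, not the exact API of pytorch_lightning/loggers/base.py.

import numpy as np

class AggregatingLogger:
    """Buffers metrics per step and merges values reported for the same step."""

    def __init__(self, agg_key_funcs=None, agg_default_func=np.mean):
        # Per-key merge functions; any other key falls back to the default.
        self.agg_key_funcs = agg_key_funcs or {}
        self.agg_default_func = agg_default_func
        self._agg_step = None
        self._agg_metrics = {}

    def agg_and_log_metrics(self, metrics, step):
        # A new step flushes whatever was accumulated for the previous one.
        if self._agg_step is not None and step != self._agg_step:
            self._flush()
        self._agg_step = step
        for key, value in metrics.items():
            self._agg_metrics.setdefault(key, []).append(value)

    def _flush(self):
        reduced = {
            key: self.agg_key_funcs.get(key, self.agg_default_func)(values)
            for key, values in self._agg_metrics.items()
        }
        self.log_metrics(reduced, self._agg_step)
        self._agg_metrics, self._agg_step = {}, None

    def log_metrics(self, metrics, step):
        print(f"step {step}: {metrics}")

logger = AggregatingLogger(agg_key_funcs={"lr": lambda vals: vals[-1]})
logger.agg_and_log_metrics({"loss": 1.0, "lr": 0.1}, step=0)
logger.agg_and_log_metrics({"loss": 0.5, "lr": 0.1}, step=0)  # same step: buffered
logger.agg_and_log_metrics({"loss": 0.4}, step=1)             # new step: flushes step 0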
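
The ".item" bullets above come from merged PR #1254. The point: calling `.item()` on a CUDA tensor copies the value to the host, which forces the host to wait for all queued GPU work, so doing it every step (or on every rank under DDP) stalls training. A minimal illustration of the pattern, not the PR's actual diff:

import torch

running = torch.zeros(1)
for step in range(100):
    loss = (torch.randn(10) ** 2).mean()  # stand-in for a real training step
    running += loss.detach()              # stays a tensor: no forced device sync
    # running += loss.item()              # on GPU this would synchronize every step

print(running.item() / 100)               # pay the device-to-host sync once, at the end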
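
TensorRunningAccum (see supporters.py in the listing below) keeps a fixed-size window of recent values, e.g. for a smoothed running loss. A rough sketch of the fixed-window idea under assumed names; the real class may differ in details:

import torch

class RunningWindow:
    """Fixed-size circular buffer with running statistics over its contents."""

    def __init__(self, window_size: int):
        self.window_size = window_size
        self.memory = torch.zeros(window_size)
        self.current_idx = 0
        self.rotated = False  # True once the buffer has wrapped around

    def append(self, x):
        self.memory[self.current_idx] = x
        self.current_idx = (self.current_idx + 1) % self.window_size
        if self.current_idx == 0:
            self.rotated = True

    def _filled(self):
        # Before the first wrap, only part of the buffer holds real values.
        return self.memory if self.rotated else self.memory[: self.current_idx]

    def mean(self):
        return self._filled().mean()

    def max(self):
        return self._filled().max()

window = RunningWindow(window_size=3)
for value in [1.0, 2.0, 3.0, 4.0]:
    window.append(value)
print(window.mean())  # tensor(3.) -- mean over the last 3 values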
__init__.py add trainer attribute to denote if interrupted (#1368) 2020-04-05 11:12:41 -04:00
auto_mix_precision.py Set precision=16 when use_amp is passed as True (#1145) 2020-04-06 08:13:24 -04:00
callback_config.py proper checkpoint implementation (#1043) 2020-03-04 23:02:19 -05:00
callback_hook.py cleaning imports (#1032) 2020-03-12 12:41:37 -04:00
data_loading.py Add warning for few workers (#1378) 2020-04-05 11:07:16 -04:00
deprecated_api.py Simplify progress bar args (#1108) 2020-04-03 00:53:00 +02:00
distrib_data_parallel.py load_spawn_weights only in proc rank 0 (#1385) 2020-04-06 10:17:16 -04:00
distrib_parts.py Set precision=16 when use_amp is passed as True (#1145) 2020-04-06 08:13:24 -04:00
evaluation_loop.py Fix for incorrect run on the validation set with overwritten validation_epoch_end and test_end (#1353) 2020-04-03 09:25:32 -04:00
ignored_warnings.py Dim 0 warning (#256) 2019-09-26 13:20:54 -04:00
logging.py Added accumulation of loggers' metrics for the same steps (#1278) 2020-04-08 08:35:47 -04:00
model_hooks.py quick patch __code__ (#1352) 2020-04-03 08:40:02 -04:00
optimizers.py Update optimizers.py (#1383) 2020-04-07 09:09:23 -04:00
supporters.py Added accumulation of loggers' metrics for the same steps (#1278) 2020-04-08 08:35:47 -04:00
trainer.py Added accumulation of loggers' metrics for the same steps (#1278) 2020-04-08 08:35:47 -04:00
training_io.py Checkpointing interval (#1272) 2020-03-30 18:37:02 -04:00
training_loop.py Added accumulation of loggers' metrics for the same steps (#1278) 2020-04-08 08:35:47 -04:00
training_tricks.py nan detection and intervention (#1097) 2020-03-19 09:24:45 -04:00