Commit Graph

382 Commits

Author SHA1 Message Date
Alexey Karnachev 4c34d16a34
Fixed configure optimizer from dict without "scheduler" key (#1443)
* `configure_optimizer` from dict with only "optimizer" key. bug fixed

* autopep8

* pep8speaks suggested fixes

* CHANGELOG.md upd
2020-04-10 11:43:06 -04:00
Alex Sergeev 8dd9b80d7a
Fix gradient clipping (#1438)
* Fix gradient clipping

* Relax accuracy constraint
2020-04-09 21:08:28 -04:00
Jirka Borovec 17f58d2e11
add rank warning (#1428)
* add rank warning

* changelog

* use rank_zero_warn

* use trainer_init

* replace warnings

* fix test

* flake8

* docs

* changelog

* bug lol
2020-04-09 14:05:46 -04:00
Alexey Karnachev ddbf7de6dc
Added accumulation of loggers' metrics for the same steps (#1278)
* `add_argparse_args` method fixed (argument types added)

* autopep8 fixes

* --gpus=0 removed from test (for ci tests)

* Update pytorch_lightning/trainer/trainer.py

Co-Authored-By: Joe Davison <joe@huggingface.co>

* test_with_accumulate_grad_batches added

* agg_and_log_metrics logic added to the base logger class

* small format fix

* agg metrics strategies removed (not to complicate stuff)

* agg metrics: handle zero step

* autopep8

* changelog upd

* flake fix

* metrics aggregators factored out, metrics_agg.py added + tests

* metrics agg default value added

* Update pytorch_lightning/loggers/metrics_agg.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* remove .item which causes sync issues (#1254)

* remove .item which causes sync issues

* fixed gradient acc sched

* test_metrics_agg.py removed (all tested in docstrings), agg metrics refactored

* autopep8

* loggers base.py types fixed

* test

* test

* metrics aggregation for loggers: each key now has a specific function (or default one)

* docstrings upd

* manual typehints removed from docstrings

* batch_size decreased for test `test_with_accumulate_grad_batches`

* extend running accum

* refactor

* fix tests

* fix tests

* allowed_types generator scoped

* trainer.py distutils was imported twice, fixed

* TensorRunningAccum refactored

* TensorRunningAccum added to change log (Changed)

* change log pull link added

Co-authored-by: Joe Davison <joe@huggingface.co>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-08 08:35:47 -04:00
Jeremy Jordan 91c9b29d47
add trainer attribute to denote if interrupted (#1368)
* add trainer attribute to denote if interrupted

* bugfix and formatting
2020-04-05 11:12:41 -04:00
Ethan Harris b18accc64c
Add warning for few workers (#1378)
* Add warning for few workers

* Fix style issue

* Update CHANGELOG.md

* Update test

* formatting

* formatting

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-04-05 11:07:16 -04:00
Justus Schock f6a86e8551
generalize reinstantiation of dataloader (#1346)
* generalize reinstantiation of dataloader

* fix condition

* add test

* update changelog

* fix changelog

Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-03 17:55:08 -04:00
William Falcon 3c5530c29d
Wandb bug/wandb multi (#1360)
* Allow reinits in sub procs

* Don't create an experiment on pickle, name, or project

* Comments consistency

* Fix test

* Apply suggestions from code review

Co-authored-by: Chris Van Pelt <vanpelt@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-04-03 15:03:00 -04:00
William Falcon dd5a05926c
Borisdayma: fix(wandb) - fix watch method (#1361)
* fix(wandb): fix watch method

* rebased

* Apply suggestions from code review

Co-authored-by: Boris Dayma <boris.dayma@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-04-03 15:02:38 -04:00
Adrian Wälchli ebd9fc9530
Fix for incorrect run on the validation set with overwritten validation_epoch_end and test_end (#1353)
* reorder if clauses

* fix wrong method overload in test

* fix formatting

* update change_log

* fix line too long
2020-04-03 09:25:32 -04:00
Jean-Baptiste SCHIRATTI 868b172f05
Make training_epoch_end behave like validation_epoch_end (#1357)
* Make training_epoch_end behave like validation_epoch_end + minor fixes in docstrings.

* Minor fixes (Borda's comments).

* Detach tensors in batch_output (to avoid possible memory leak) + doc fix.

Co-authored-by: Jean-Baptiste SCHIRATTI <jean-baptisteschiratti@MacBook-Pro-de-Jean-Baptiste.local>
2020-04-03 14:43:26 +02:00
Gerard Bentley f33b5a8d99
Simplify progress bar args (#1108)
* show progress bar dependent on refresh_rate

* test progress_bar_refresh control show bar

* remove show_progress_bar from other tests

* borda fixes

* flake8 fix

* changelog update prog bar refresh rate

* move show_progress_bar to deprecated 0.9 api

* rm show_progress_bar references, test deprecated

* Update pytorch_lightning/trainer/__init__.py

* fix test

* changelog

* minor CHANGELOG.md format

* Update pytorch_lightning/trainer/__init__.py

* Update pytorch_lightning/trainer/trainer.py

Co-authored-by: Gerard Bentley <gbkh2015@mymail.pomona.edu>
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-03 00:53:00 +02:00
Jirka Borovec 724b787cd1
faster CI testing (#1323)
* MNIST digits

* increase test acc

* smaller parity

* drone builds

* increase GH action timeout

* drone format

* fix paths

* drone cache

* circle cache

* fix test

* lower nb epochs

* circleCI

* use orb

* fix test

* fix test

* circle cache

* circle cache

* circle cache

* comment caches

* benchmark batch size

* cache dataset

* smaller dataset

* smaller dataset

* fix nb samples

* batch size

* fix test
2020-04-02 12:28:44 -04:00
Nicki Skafte 2912239fe6
Add useful errors when model is not configured correctly (#1199)
* add check_model_configuration method

* trying to fix errors

* trying to fix tests

* added test_epoch_end to lightning template

* fix tests

* fix new test after rebase

* fix spelling

* added more checks

* updated formatting

* added tests

* fixed CHANGELOG

* Apply suggestions from code review

* move test to new module

* change check on configure_optimizers

Co-authored-by: Nicki Skafte <nugginea@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-04-02 11:53:37 -04:00
Ethan Harris 28242f02d1
Remove default optimizer, add None optimizer option (#1279)
* Add warning when using default optimizer

* Refactor optimizer tests to test_optimizers

* Remove default optimizer, add option to use no optimizer

* Update CHANGELOG.md

* Update pytorch_lightning/trainer/optimizers.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Fix style

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-04-02 11:48:53 -04:00
Asaf Manor aca8c7e6f3
Optimizer Frequencies logic, and new configure_optimizers (#1269)
* init_optimizers accepts Dict, Sequence[Dict]
and returns optimizer_frequencies.
optimizer_frequencies was added as a member of Trainer.

* Optimizer frequencies logic implemented in training_loop.
Description added to configure_optimizers in LightningModule

* optimizer frequencies tests added to test_gpu

* Fixed formatting for merging PR #1269

* Apply suggestions from code review

* Apply suggestions from code review

Co-Authored-By: Asaf Manor <32155911+asafmanor@users.noreply.github.com>

* Update trainer.py

* Moving get_optimizers_iterable() outside.

* Update note

* Apply suggestions from code review

* formatting

* formatting

* Update CHANGELOG.md

* formatting

* Update CHANGELOG.md

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-03-31 16:41:24 +00:00
Adrian Wälchli d6646e151a
Move some tests to correct subfolder/file (#1312)
* move some tests to trainer file

* fix imports
2020-03-31 08:58:46 -04:00
Jirka Borovec 6ddb03922a
Profiler summary (#1259)
* refactor and add types

* add Profiler summary

* fix imports

* Revert "refactor and add types"

This reverts commit b4c552fa

* changelog

* revert rename

* fix test

* mute verbose
2020-03-31 08:57:48 -04:00
Adrian Wälchli 1aba411da9
Early stopping when validation is disabled (#1235)
* early stop fallback to train epoch

* added test

* fix imports

* update docs

* update changelog

* fix typo
2020-03-31 06:24:26 +00:00
Bilal Khan a707d4bea1
Replace Wandb callback's finalize with no-op (#1193)
* Replace Wandb callback's finalize with no-op

* Update pytorch_lightning/loggers/wandb.py

* Update wandb.py

* remove wandb logger's finalize and update tests

* update changelog

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-03-30 18:45:06 -04:00
Nicki Skafte 2ccc7456ca
Error on zero length dataloaders (#1280)
* error_on_zero_length

* update CHANGELOG.md

* added test

* Update pytorch_lightning/trainer/data_loading.py

Co-authored-by: Nicki Skafte <nugginea@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-03-30 18:43:53 -04:00
Jirka Borovec 09167efdb5
Checkpointing interval (#1272)
* formatting

* formatting

* fix interval

* fix train loop

* fix test

* parametrize test

* Apply suggestions from code review

Co-Authored-By: Adrian Wälchli <adrian.waelchli@students.unibe.ch>

* fix calling

* flake8

* add types

Co-authored-by: Adrian Wälchli <adrian.waelchli@students.unibe.ch>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-03-30 18:37:02 -04:00
Jirka Borovec 2ca5356429
clear skipping tests (#1285)
* clear skipping tests

* fix simple/multi GPU

* review: simplify
2020-03-30 18:29:23 -04:00
Jirka Borovec 31017120fd
fix incomplete RunningMean (#1309)
* fix RunningMean

* changelog

* fix none

* Update supporters.py

just needed to multiply by zero for init

* Revert "Update supporters.py"

This reverts commit 7e0da6c6

* fix NaN

* formatting

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-03-30 18:28:31 -04:00
Adrian Wälchli b7de42f70d
Add MNIST dataset & drop torchvision dep. from tests (#986)
* added custom mnist without torchvision dep

* move files so it does not conflict with mnist gitignore

* mock torchvision for tests

* fix line too long

* fix line too long

* fix "module level import not at top of file" warning

* move mock imports to __init__.py

* simplify MNIST a lot and download directly the .pt files

* further simplify and clean up mnist

* revert import overrides

* make as before

* drop PIL requirement

* move mnist.py to datasets subfolder

* use logging instead of print

* choose same name as in torchvision

* remove torchvision and pillow also from yml file

* refactor if train

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* capitalized class attr

* moved mnist to models

* re-added datasets ignore

* better name for file variable

* Update mnist.py

* move dataset classes to datasets.py

* new line

* update

* update

* fix automerge

* move to base folder

* adapt testingmnist to new mnist base class

* remove temporal fix

* fix datatype

* remove old testingmnist

* readable

* fix import

* fix whitespace

* docstring

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/base/datasets.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* changelog

* added types

* Update CHANGELOG.md

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* exist->isfile

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* index -> idx

* temporary fix for trains error

* better changelog message

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-03-30 18:25:37 -04:00
Jirka Borovec c869dd8b8f
make evaluate private (#1260)
* make evaluate private

* changelog
2020-03-30 12:14:27 -04:00
Ethan Harris ab09faa15e
Add support for iterable datasets when val_check_interval=1.0 (#1283)
* Add support for iterable datasets when val_check_interval=1.0

* Update CHANGELOG.md
2020-03-29 15:27:44 -04:00
Jeremy Jordan 54507f417e
fix logging config and add profiler test (#1267)
2020-03-29 14:56:36 -04:00
Jirka Borovec 61177cd1c8
system info (#1234)
* system info

* update big info

* test script

* update config

* rename script

* import path
2020-03-27 08:45:52 -04:00
Tyler Yep 6772e0c197
Remove unnecessary parameters to super() in documentation and source code (#1240)
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-03-27 12:36:50 +00:00
Jeremy Jordan d394b80ac8
calling self.forward() -> self() (#1211)
* self.forward() -> self()

* update changelog

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-03-27 08:17:56 +01:00
Adrian Wälchli 2a4cd479e2
Disable validation when val_percent_check=0 (#1251)
* fix disable validation

* add test

* update changelog

* update docs for val_percent_check

* make "fast training" docs consistent
2020-03-27 02:07:22 +00:00
Jirka Borovec 45d671a4a8
CI: split tests-examples (#990)
* CI: split tests-examples

* tests without template

* comment depends

* CircleCI typo

* add doctest

* update test req.

* CI tests

* setup macOS

* longer train

* lower pred acc

* fix model

* rename default model

* lower tests acc

* typo

* imports

* fix test optimizer

* update calls

* fix Win

* lower Drone image

* fix call

* pytorch image

* fix test

* add dev image

* add dev image

* update image

* drone volume

* lint

* update test notes

* rename tests/models >> tests/base

* group models

* conftest

* optim imports

* typos

* fix import

* fix tests

* install AMP

* tests

* fix import
2020-03-25 07:46:27 -04:00
Alexey Karnachev ced662fc27
Custom argparser extension with Trainer arguments (argument types added) (#1147)
* `add_argparse_args` method fixed (argument types added)

* CHANGELOG.md upd

* autopep8 fixes

* --gpus=0 removed from test (for ci tests)

* typo fixed

* reduce on plateau scheduler fixed

* Trainer cli related tests moved to test_trainer_cli.py

* refactored: get_init_arguments_and_types is a public classmethod of the Trainer now

* test_get_init_arguments_and_types added

* autopep8 fixes

* Apply suggestions from code review

* cosmetics

* cosmetics

* Update pytorch_lightning/trainer/trainer.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* `Trainer.get_init_arguments_and_types` now returns arg types wrapped in tuples (not in sets)

* deprecated args are now ignored in argparser

* get_deprecated_arg_names small refactor

* get_deprecated_arg_names bug fixed

* Update pytorch_lightning/trainer/trainer.py

Co-Authored-By: Joe Davison <joe@huggingface.co>

* Update pytorch_lightning/trainer/trainer.py

Co-Authored-By: Joe Davison <joe@huggingface.co>

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Joe Davison <joe@huggingface.co>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-03-24 14:55:27 -04:00
Jeremy Jordan 4c2026bf9a
increase profiler test coverage (#1208)
* increase profiler test coverage

* fix line length

* tests for valueerror assertions
2020-03-24 09:15:16 -04:00
Jirka Borovec 3be81cb54e
test deprecated - model (#1074)
* pylint

* model API

* update test

* formatting

* disable logger

* fix checking overwrite

* fix test

* typo

* deprecated model

* fix for DDP

* drop Flake8 in GH actions

* Update pytorch_lightning/trainer/evaluation_loop.py

* fix imports

Co-authored-by: Nic Eggert <nic@eggert.io>
2020-03-20 20:51:14 +01:00
Adrian Wälchli 732eaee4d7
nan detection and intervention (#1097)
* check for nan values

* test nan detection on loss

* sys.exit

* whitespace

* detect nan and inf values in loss and params

* update

* added documentation

* moved detect nan to training loop, remove flag for print

* blank line

* test

* rename

* deprecate print_nan_grads

* deprecated print_nan_grads

* remove unused imports

* update changelog

* fix line too long

* correct deprecated version

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* raise exception instead of sysexit

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* raise exception instead of sysexit

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/trainer/training_tricks.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/trainer/training_tricks.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* fix test

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-03-19 09:24:45 -04:00
So Uchida 01b8991c5a
Support hierarchical dict (#1152)
* Add support for hierarchical dict

* Support nested Namespace

* Add docstring

* Migrate hparam flattening to each logger

* Modify URLs in CHANGELOG

* typo

* Simplify the conditional branch about Namespace

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update CHANGELOG.md

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* added examples section to docstring

* renamed _dict -> input_dict

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-03-19 09:15:47 -04:00
Jirka Borovec 22a7264e9a
improve partial Codecov (#1172)
* ignore in setup

* show report

* abs imports

* abstract pass

* cover loggers

* doctest trains

* locals

* pass

* revert tensorboard

* use tensorboardX

* revert tensorboardX

* fix trains

* Add TrainsLogger.set_credentials (#1179)

* Add TrainsLogger.set_credentials to control trains server configuration and authentication from code. Sync trains package version.
Fix CI Trains tests

* Add global TrainsLogger set_bypass_mode (#1187)

* Add global TrainsLogger set_bypass_mode skips all external communication

Co-authored-by: bmartinn <>

* rm some no-cov

Co-authored-by: Martin.B <51887611+bmartinn@users.noreply.github.com>
2020-03-19 09:14:29 -04:00
Nicki Skafte 384e124490
ReduceLROnPlateau bug fix (#1126)
* bug fix and test

* update CHANGELOG.md

Co-authored-by: Nicki Skafte <nugginea@gmail.com>
2020-03-16 14:35:10 -04:00
Jakub 3ad6169f18
Neptune Logger Improvements (#1084)
* removed project and experiment from getstate

* added tests for closing experiment, updated token in example to user neptuner

* updated token

* Update neptune.py

added a link to example experiment

* added example experiment link

* dropped duplication

* flake fixes

* merged with master, added changes information to CHANGELOG
2020-03-14 13:02:40 -04:00
Martin.B c0bedd2587
Add TRAINS experiment manager support (#1122)
* Add allegro.ai TRAINS experiment manager support

* improve docstring and type hinting, fix the bug in log_metrics, add support torch.Tensor to input into log_image

* complete missing docstring of constructor's arguments

* fix docs

* pep8

* pep8

* remove redundant typing
use logging
fix typing and pep8

* remove deprecated interface

* add TrainsLogger test

* add TrainsLogger PR in CHANGELOG

* add id/name property documentation

* change logging as log

Co-authored-by: bmartinn <>
Co-authored-by: Sou Uchida <s.aiueo32@gmail.com>
2020-03-14 13:02:14 -04:00
monney da61398835
Add Support for Non-primitive types in TensorboardLogger (#1130)
* Added support for non-primitive types to tensorboard logger

* added EOF newline

* PEP8

* Updated CHANGELOG for PR #1130. Moved _sanitize_params to base logger. Cleaned up _sanitize_params

* Updated CHANGELOG for PR #1130. Moved _sanitize_params to base logger. Cleaned up _sanitize_params

* changed convert_params to static method

* PEP8

* Cleanup Doctest for _sanitize_params

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Removed OrderedDict import

* Updated import order to conventions

Co-authored-by: Manbir Gulati <manbirgulati@Manbirs-MBP.hsd1.md.comcast.net>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-03-14 13:02:05 -04:00
Jirka Borovec 1d5f06223a
fix tmpdir (#1012)
* fix tmpdir

* just str path
2020-03-12 12:46:25 -04:00
Ethan Harris 2b3f443f6b
Add support for IterableDatasets everywhere (#1104)
* Add support for IterableDatasets everywhere

* Added type hints, simplified code and improved coverage in data_loading.py

* Update CHANGELOG.md
2020-03-12 12:46:02 -04:00
Jirka Borovec 514d182b7f
cleaning imports (#1032)
2020-03-12 12:41:37 -04:00
Jirka Borovec 4896815067
remove deprecated `data_loader` (#1077)
* change version in CHANGELOG

* warning

* remove deprecated data_loader

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-03-06 16:11:05 -05:00
William Falcon 3d18099262
removed decorators (#1079)
2020-03-06 16:09:47 -05:00
Jirka Borovec ff1f8ef400
Test deprecated API for 0.8.0 and 0.9.0 (#1071)
* till 0.8

* refactor

* fix tests

* fix tests

* deprx till 0.9

* Update trainer.py

* Apply suggestions from code review

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-03-06 21:36:44 +01:00
William Falcon 0ebfb78570
Examples: using new API (#1056)
* using new API

* typo
2020-03-05 19:31:57 -05:00