Commit Graph

505 Commits

Author SHA1 Message Date
William Falcon 890458fdbd
Fixes automatic parser bug (#1585)
* fixes gpu parsing

* fixes gpu parsing
2020-04-23 21:00:41 -04:00
Adrian Wälchli 3e8f2d99a9
Progress bar callback (#1450)
* squash and rebase

sanity check hooks
sanity check callback hook finish
moved core progress bar functionality into callback
wip
remove duplicate merge
clean up
imports
docs
sanity check progress bar main
sanity
move callback calls
init progress bar callback
configuration and docs
changelog
rate decorator
pass process_position
disable on rank > 0
position index
is_enabled
remove decorator
refactor init tqdm bars
callback method ordering
cannot reset when disabled
sequence -> list
default values
fix has no attr _time()
move on_val_end to proper place
fix the pickle issue
update warning
properties
check for None
remove old comment
switch order
pull out non-tqdm functionality into base class
documentation for the base class
docs
fix refresh rate issue in validation
restrict type hint of trainer arg
more docs
update trainer docs
rst docs
fix lines too long
fix test
add missing type hints
fix typo
move docstring to __init__ solves doctest failures
remove doctest :(( can't fix the pickle error
fix example
simplify by saving trainer reference
fix docs errors
move docstring
initial value
multiple val checks per epoch
simpler handling of inf dataset sizes
update inf docs
renamed training_tqdm_dict
rename get_tqdm_dict
rename occurrences of tqdm
update changelog
fix doctest
fix formatting errors
added callback tests
progress bar on off test
more tests for progress bar
weird test fix?
add ignored property
disable default progress bar in LR finder
change enable/disable behavior
trying doctest in CI again
undo doctest pickle error
undo doctest pickle error :((
remove progress_bar_callback Trainer arg and fix tests
restore progress bar after auto lr find
update docs
fix rebase
fix wrong negation

* fix fast dev run total

* more thorough testing

* remove old args

* fix merge

* fix merge

* separate tests

* type hint total batches

* reduce if

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* is_disabled

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* is_enabled

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* rename enabled/disabled

* move deprecated api

* remove duplicated test from merge

* fix rename is_disabled

* newline

* test also testprogress for fast dev run

Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-04-23 20:46:18 -04:00
Guy Davidson fe2b6666e0
Fixing a small issue in trainer logging (#1563)
* The epoch was being logged to metrics, which isn't read, rather than to current_metrics.

* Updated the tests to account for the epoch arriving at the logger.
2020-04-23 17:52:41 -04:00
Jirka Borovec 7989ca844c
test deprecation warnings (#1470)
* check deprecation warnings

* extend warning test

* try

* unimport modules

* update
2020-04-23 17:34:47 -04:00
Jirka Borovec 0b22b64a10
Tests/docker (#1573)
* devel image

* try parallel

* new image
2020-04-23 12:52:59 -04:00
Nicki Skafte e977d1cde5
Default value for ModelCheckpoint filepath (#1548)
* allow determine of filepath at runtime

* typing

Co-authored-by: Nicki Skafte <nugginea@gmail.com>
2020-04-23 11:50:58 -04:00
Travis Addair 7024177f7d
Added Horovod distributed backend (#1529)
* Initial commit of Horovod distributed backend implementation

* Update distrib_data_parallel.py

* Update distrib_data_parallel.py

* Update tests/models/test_horovod.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/models/test_horovod.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Fixed tests

* Added six

* tests

* Install tox for GitHub CI

* Retry tests

* Catch all exceptions

* Skip cache

* Remove tox

* Restore pip cache

* Remove the cache

* Restore pip cache

* Remove AMP

Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-22 17:39:08 -04:00
Jirka Borovec c1c6e3b6c9
default test logger (#1478)
* default test logger

* fix tests

* spawn

* try

* simplify tests

* simplify tests

* formatting

* loggers

* loggers

* revert to TestTube

* default

* default

* wraps

* world size

* optim imports
2020-04-21 20:33:10 -04:00
Jirka Borovec bd168819f2
fix changelog (#1452)
* fix changelog

* formatting

* add ddp_cpu

* docs

* add another
2020-04-20 17:36:26 -04:00
Adrian Wälchli 452fa858f4
skip warning test (#1533) 2020-04-20 08:04:37 +00:00
William Falcon ae2e14e3ed
fixed memory leak from opt return (#1528)
* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return
2020-04-19 16:41:54 -04:00
Adrian Wälchli 3c549e8ae3
Call on_before_zero_grad model hook (#1493)
* call on_before_zero_grad

* update changelog

* add note about overriding both hooks

* added test

* move test_hooks.py to models folder
2020-04-16 12:01:41 -04:00
Nic Eggert e3001a0929
Add ddp_cpu backend for testing ddp without GPUs (#1158)
* Add tests for distributed backend config

* Refactor set_distributed_mode

* Use gloo backend on cpu

* Use 127.0.0.1 instead of 127.0.0.2

Not totally clear on why this is necessary, but it seems to work

* Update LightningDDP so that it works with CPU

* Add ddp_cpu backend and num_processes Trainer arg

* PEP8

* Fix test skipping. Inequalities are hard :/

* Skip ddp_cpu test on Windows

* Make a few more cases fall back to ddp_cpu

* New function name

* Flake8

* Don't test distributed on MacOS with torch < 1.3

Support for distributed in MacOS was added in Torch 1.3.0

* Add ddp_cpu and num_processes to docs

* Parametrize trainer config tests

* Tweak warning

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Remove redundant test

* Replace pass branches with comments

* Add missing warnings import

* save_path -> root_dir

* Use new rank_zero_warn

* Whitespace

* Apply suggestions from code review

* formatting

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-15 23:17:31 -04:00
William Falcon 3431c62d41
Remove error when test dataloader used in test (#1495)
* remove error when test dataloader used in test

* remove error when test dataloader used in test

* remove error when test dataloader used in test

* remove error when test dataloader used in test

* remove error when test dataloader used in test

* remove error when test dataloader used in test

* fix lost model reference

* remove error when test dataloader used in test

* fix lost model reference

* moved optimizer types

* moved optimizer types

* moved optimizer types

* moved optimizer types

* moved optimizer types

* moved optimizer types

* moved optimizer types

* moved optimizer types

* added tests for warning

* fix lost model reference

* fix lost model reference

* added tests for warning

* added tests for warning

* refactoring

* refactoring

* fix imports

* refactoring

* fix imports

* refactoring

* fix tests

* fix mnist

* flake8

* review

Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-15 22:16:40 -04:00
Jirka Borovec 8322f1b039
neptune online (#1499) 2020-04-15 11:14:29 -04:00
Jirka Borovec b3fe17ddeb
fix flushing loggers (#1459)
* flushing loggers

* flushing loggers

* flushing loggers

* flushing loggers

* changelog

* typo

* fix trains

* optimize imports

* add logger test all

* add logger test pickle

* flake8

* fix benchmark

* hanging loggers

* try

* del

* all

* cleaning
2020-04-14 20:32:33 -04:00
William Falcon c96c6a6b33
attempting to remove some speed issues (#1482)
* removed some .items

* added speed tests

* added speed tests

* Update benchmarks/test_rnn_parity.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update benchmarks/test_trainer_parity.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* fix lost model reference

* added speed tests

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-04-14 20:23:36 -04:00
Ethan Harris 8544b334e4
Replace automatic nan check with optional flag (#1475)
* Replace automatic nan check with optional flag

* Update CHANGELOG.md
2020-04-13 14:06:25 -04:00
Nicki Skafte 3f09b32df3
Learning Rate finder (#1347)
* initial structure

* rebase

* incorporate suggestions

* update CHANGELOG.md

* initial docs

* fixes based on reviews

* added trainer arg

* update docs

* added saving/restore of model state

* initial tests

* fix styling

* added more tests

* fix docs, backward compatibility and progressbar

* fix styling

* docs update

* updates based on review

* changed saving to standard functions

* consistent naming

* fix formatting

* improve docs, added support for nested fields, improve codecov

* update CHANGELOG.md

* Update lr_finder.rst

* Update pytorch_lightning/trainer/trainer.py

* Update trainer.py

* Update CHANGELOG.md

* Update path

* restoring

* test

* attribs

* docs

* doc typo

Co-authored-by: Nicki Skafte <nugginea@gmail.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-10 14:34:23 -04:00
Jirka Borovec d05ac813dc
fix deprecated default_save_path (#1449) 2020-04-10 14:32:56 -04:00
William Falcon b78c3d4da8
Fix weights path (#1445)
* renamed default path to actual root_dir

* added default weights path

* added default weights path

* added default weights path
2020-04-10 12:02:59 -04:00
Allard Hendriksen 7ac1580a31
Add automatic GPU choice to trainer (#1426)
* Add automatic GPU choice to trainer

This commit adds the `gpu_choice` parameter to Trainer. By default,
this parameter is set to 'manual' which causes no observable
difference in behavior.

When `gpu_choice` is set to "auto" and `gpus` is an int, then the
trainer will automatically allocate the first available GPU.
This is especially useful when GPUs are configured to be in "exclusive
mode", which means that only one process at a time can use them.

* Rename gpu_choice -> auto_select_gpus
2020-04-10 11:45:29 -04:00
Rohit Gupta e79ae18cae
Add test_dataloaders to test method (#1434)
* Add test_dataloaders to test method

* Remove test_dataloaders from .fit()

* Fix code comment

* Fix tests

* Add test_dataloaders to test method (#1393)

* Fix failing tests

* Update docs (#1393)
2020-04-10 11:44:03 -04:00
Alexey Karnachev 4c34d16a34
Fixed configure optimizer from dict without "scheduler" key (#1443)
* `configure_optimizer` from dict with only "optimizer" key. bug fixed

* autopep8

* pep8speaks suggested fixes

* CHANGELOG.md upd
2020-04-10 11:43:06 -04:00
Alex Sergeev 8dd9b80d7a
Fix gradient clipping (#1438)
* Fix gradient clipping

* Relax accuracy constraint
2020-04-09 21:08:28 -04:00
Jirka Borovec 17f58d2e11
add rank warning (#1428)
* add rank warning

* changelog

* use rank_zero_warn

* user trainer_init

* replace warnings

* fix test

* flake8

* docs

* changelog

* bug lol
2020-04-09 14:05:46 -04:00
Alexey Karnachev ddbf7de6dc
Added accumulation of loggers' metrics for the same steps (#1278)
* `add_argparse_args` method fixed (argument types added)

* autopep8 fixes

* --gpus=0 removed from test (for ci tests)

* Update pytorch_lightning/trainer/trainer.py

Co-Authored-By: Joe Davison <joe@huggingface.co>

* test_with_accumulate_grad_batches added

* agg_and_log_metrics logic added to the base logger class

* small format fix

* agg metrics strategies removed (not to complicate stuff)

* agg metrics: handle zero step

* autopep8

* changelog upd

* flake fix

* metrics aggregators factored out, metrics_agg.py added + tests

* metrics agg default value added

* Update pytorch_lightning/loggers/metrics_agg.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* metrics aggregators factored out, metrics_agg.py added + tests

* metrics agg default value added

* Update pytorch_lightning/loggers/metrics_agg.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* remove .item which causes sync issues (#1254)

* remove .item which causes sync issues

* fixed gradient acc sched

* fixed gradient acc sched

* test_metrics_agg.py removed (all tested in docstrings), agg metrics refactored

* test_metrics_agg.py removed (all tested in docstrings), agg metrics refactored

* autopep8

* loggers base.py types fixed

* test

* test

* metrics aggregation for loggers: each key now has a specific function (or default one)

* metrics aggregation for loggers: each key now has a specific function (or default one)

* docstrings upd

* manual typehints removed from docstrings

* batch_size decreased for test `test_with_accumulate_grad_batches`

* extend running accum

* refactor

* fix tests

* fix tests

* allowed_types generator scoped

* trainer.py distutils was imported twice, fixed

* TensorRunningAccum refactored

* TensorRunningAccum added to change log (Changed)

* change log pull link added

Co-authored-by: Joe Davison <joe@huggingface.co>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-08 08:35:47 -04:00
Jeremy Jordan 91c9b29d47
add trainer attribute to denote if interrupted (#1368)
* add trainer attribute to denote if interrupted

* bugfix and formatting
2020-04-05 11:12:41 -04:00
Ethan Harris b18accc64c
Add warning for few workers (#1378)
* Add warning for few workers

* Fix style issue

* Update CHANGELOG.md

* Update test

* formatting

* formatting

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-04-05 11:07:16 -04:00
Justus Schock f6a86e8551
generalize reinstantiation of dataloader (#1346)
* generalize reinstantiation of dataloader

* fix condition

* add test

* update changelog

* fix changelog

Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-03 17:55:08 -04:00
William Falcon 3c5530c29d
Wandb bug/wandb multi (#1360)
* Allow reinits in sub procs

* Dont create an experiment on pickle, name, or project

* Comments consistency

* Fix test

* Apply suggestions from code review

Co-authored-by: Chris Van Pelt <vanpelt@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-04-03 15:03:00 -04:00
William Falcon dd5a05926c
Borisdayma: fix(wandb) - fix watch method (#1361)
* fix(wandb): fix watch method

* rebased

* Apply suggestions from code review

Co-authored-by: Boris Dayma <boris.dayma@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-04-03 15:02:38 -04:00
Adrian Wälchli ebd9fc9530
Fix for incorrect run on the validation set with overwritten validation_epoch_end and test_end (#1353)
* reorder if clauses

* fix wrong method overload in test

* fix formatting

* update change_log

* fix line too long
2020-04-03 09:25:32 -04:00
Jean-Baptiste SCHIRATTI 868b172f05
Make training_epoch_end behave like validation_epoch_end (#1357)
* Make training_epoch_end behave like validation_epoch_end + minor fixes in docstrings.

* Minor fixes (Borda's comments).

* Detach tensors in batch_output (to avoid possible memory leak) + doc fix.

Co-authored-by: Jean-Baptiste SCHIRATTI <jean-baptisteschiratti@MacBook-Pro-de-Jean-Baptiste.local>
2020-04-03 14:43:26 +02:00
Gerard Bentley f33b5a8d99
Simplify progress bar args (#1108)
* show progress bar dependent on refresh_rate

* test progress_bar_refresh control show bar

* remove show_progress_bar from other tests

* borda fixes

* flake8 fix

* changelog update prog bar refresh rate

* move show_progress_bar to deprecated 0.9 api

* rm show_progress_bar references, test deprecated

* Update pytorch_lightning/trainer/__init__.py

* fix test

* changelog

* minor CHANGELOG.md format

* Update pytorch_lightning/trainer/__init__.py

* Update pytorch_lightning/trainer/trainer.py

Co-authored-by: Gerard Bentley <gbkh2015@mymail.pomona.edu>
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-03 00:53:00 +02:00
Jirka Borovec 724b787cd1
faster CI testing (#1323)
* MNIST digits

* increase test acc

* smaller parity

* drone builds

* increase GH action timeout

* drone format

* fix paths

* drone cache

* circle cache

* fix test

* lower nb epochs

* circleCI

* user orb

* fix test

* fix test

* circle cache

* circle cache

* circle cache

* comment caches

* benchmark batch size

* cache dataset

* smaller dataset

* smaller dataset

* fix nb samples

* batch size

* fix test
2020-04-02 12:28:44 -04:00
Nicki Skafte 2912239fe6
Add useful errors when model is not configured correctly (#1199)
* add check_model_configuration method

* trying to fix errors

* trying to fix tests

* added test_epoch_end to lightning template

* fix tests

* fix new test after rebase

* fix spelling

* added more checks

* updated formatting

* added tests

* fixed CHANGELOG

* Apply suggestions from code review

* move test to new module

* change check on configure_optimizers

Co-authored-by: Nicki Skafte <nugginea@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-04-02 11:53:37 -04:00
Ethan Harris 28242f02d1
Remove default optimizer, add None optimizer option (#1279)
* Add warning when using default optimizer

* Refactor optimizer tests to test_optimizers

* Remove default optimizer, add option to use no optimizer

* Update CHANGELOG.md

* Update pytorch_lightning/trainer/optimizers.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Fix style

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-04-02 11:48:53 -04:00
Asaf Manor aca8c7e6f3
Optimizer Frequencies logic, and new configure_optimizers (#1269)
* init_optimizers accepts Dict, Sequence[Dict]
and returns optimizer_frequencies.
optimizer_frequencies was added as a member of Trainer.

* Optimizer frequencies logic implemented in training_loop.
Description added to configure_optimizers in LightningModule

* optimizer frequencies tests added to test_gpu

* Fixed formatting for merging PR #1269

* Apply suggestions from code review

* Apply suggestions from code review

Co-Authored-By: Asaf Manor <32155911+asafmanor@users.noreply.github.com>

* Update trainer.py

* Moving get_optimizers_iterable() outside.

* Update note

* Apply suggestions from code review

* formatting

* formatting

* Update CHANGELOG.md

* formatting

* Update CHANGELOG.md

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-03-31 16:41:24 +00:00
Adrian Wälchli d6646e151a
Move some tests to correct subfolder/file (#1312)
* move some tests to trainer file

* fix imports
2020-03-31 08:58:46 -04:00
Jirka Borovec 6ddb03922a
Profiler summary (#1259)
* refactor and add types

* add Profiler summary

* fix imports

* Revert "refactor and add types"

This reverts commit b4c552fa

* changelog

* revert rename

* fix test

* mute verbose
2020-03-31 08:57:48 -04:00
Adrian Wälchli 1aba411da9
Early stopping when validation is disabled (#1235)
* early stop fallback to train epoch

* added test

* fix imports

* update docs

* update changelog

* fix typo
2020-03-31 06:24:26 +00:00
Bilal Khan a707d4bea1
Replace Wandb callback's finalize with no-op (#1193)
* Replace Wandb callback's finalize with no-op

* Update pytorch_lightning/loggers/wandb.py

* Update wandb.py

* remove wandb logger's finalize and update tests

* update changelog

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-03-30 18:45:06 -04:00
Nicki Skafte 2ccc7456ca
Error on zero length dataloaders (#1280)
* error_on_zero_length

* update CHANGELOG.md

* added test

* Update pytorch_lightning/trainer/data_loading.py

Co-authored-by: Nicki Skafte <nugginea@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-03-30 18:43:53 -04:00
Jirka Borovec 09167efdb5
Checkpointing interval (#1272)
* formatting

* formatting

* fix interval

* fix train loop

* fix test

* parametrize test

* Apply suggestions from code review

Co-Authored-By: Adrian Wälchli <adrian.waelchli@students.unibe.ch>

* fix calling

* flake8

* add types

Co-authored-by: Adrian Wälchli <adrian.waelchli@students.unibe.ch>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-03-30 18:37:02 -04:00
Jirka Borovec 2ca5356429
clear skipping tests (#1285)
* clear skipping tests

* fix simple/multi GPU

* review: simplify
2020-03-30 18:29:23 -04:00
Jirka Borovec 31017120fd
fix incomplete RunningMean (#1309)
* fix RunningMean

* changelog

* fix none

* Update supporters.py

just needed to multiply by zero for init

* Revert "Update supporters.py"

This reverts commit 7e0da6c6

* fix NaN

* formatting

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-03-30 18:28:31 -04:00
Adrian Wälchli b7de42f70d
Add MNIST dataset & drop torchvision dep. from tests (#986)
* added custom mnist without torchvision dep

* move files so it does not conflict with mnist gitignore

* mock torchvision for tests

* fix line too long

* fix line too long

* fix "module level import not at top of file" warning

* move mock imports to __init__.py

* simplify MNIST a lot and download directly the .pt files

* further simplify and clean up mnist

* revert import overrides

* make as before

* drop  PIL requirement

* move mnist.py to datasets subfolder

* use logging instead of print

* choose same name as in torchvision

* remove torchvision and pillow also from yml file

* refactor if train

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* capitalized class attr

* moved mnist to models

* re-added datasets ignore

* better name for file variable

* Update mnist.py

* move dataset classes to datasets.py

* new line

* update

* update

* fix automerge

* move to base folder

* adapt testingmnist to new mnist base class

* remove temporal fix

* fix datatype

* remove old testingmnist

* readable

* fix import

* fix whitespace

* docstring

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/base/datasets.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* changelog

* added types

* Update CHANGELOG.md

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* exist->isfile

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* index -> idx

* temporary fix for trains error

* better changelog message

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-03-30 18:25:37 -04:00
Jirka Borovec c869dd8b8f
make evaluate private (#1260)
* make evaluate private

* changelog
2020-03-30 12:14:27 -04:00
Ethan Harris ab09faa15e
Add support for iterable datasets when val_check_interval=1.0 (#1283)
* Add support for iterable datasets when val_check_interval=1.0

* Update CHANGELOG.md
2020-03-29 15:27:44 -04:00
Jeremy Jordan 54507f417e
fix logging config and add profiler test (#1267) 2020-03-29 14:56:36 -04:00
Jirka Borovec 61177cd1c8
system info (#1234)
* system info

* update big info

* test script

* update config

* rename script

* import path
2020-03-27 08:45:52 -04:00
Tyler Yep 6772e0c197
Remove unnecessary parameters to super() in documentation and source code (#1240)
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-03-27 12:36:50 +00:00
Jeremy Jordan d394b80ac8
calling self.forward() -> self() (#1211)
* self.forward() -> self()

* update changelog

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-03-27 08:17:56 +01:00
Adrian Wälchli 2a4cd479e2
Disable validation when val_percent_check=0 (#1251)
* fix disable validation

* add test

* update changelog

* update docs for val_percent_check

* make "fast training" docs consistent
2020-03-27 02:07:22 +00:00
Jirka Borovec 45d671a4a8
CI: split tests-examples (#990)
* CI: split tests-examples

* tests without template

* comment depends

* CircleCI typo

* add doctest

* update test req.

* CI tests

* setup macOS

* longer train

* lower pred acc

* fix model

* rename default model

* lower tests acc

* typo

* imports

* fix test optimizer

* update calls

* fix Win

* lower Drone image

* fix call

* pytorch image

* fix test

* add dev image

* add dev image

* update image

* drone volume

* lint

* update test notes

* rename tests/models >> tests/base

* group models

* conftest

* optim imports

* typos

* fix import

* fix tests

* install AMP

* tests

* fix import
2020-03-25 07:46:27 -04:00
Alexey Karnachev ced662fc27
Custom argparser extension with Trainer arguments (argument types added) (#1147)
* `add_argparse_args` method fixed (argument types added)

* CHANGELOG.md upd

* autopep8 fixes

* --gpus=0 removed from test (for ci tests)

* typo fixed

* reduce on plateau scheduler fixed

* Trainer cli related tests moved to test_trainer_cli.py

* refactored: get_init_arguments_and_types is a public classmethod of the Trainer now

* test_get_init_arguments_and_types added

* autopep8 fixes

* Trainer cli related tests moved to test_trainer_cli.py

* refactored: get_init_arguments_and_types is a public classmethod of the Trainer now

* test_get_init_arguments_and_types added

* autopep8 fixes

* Trainer cli related tests moved to test_trainer_cli.py

* refactored: get_init_arguments_and_types is a public classmethod of the Trainer now

* test_get_init_arguments_and_types added

* autopep8 fixes

* Trainer cli related tests moved to test_trainer_cli.py

* test_get_init_arguments_and_types added

* autopep8 fixes

* Apply suggestions from code review

* cosmetics

* cosmetics

* Update pytorch_lightning/trainer/trainer.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* `Trainer.get_init_arguments_and_types` now returns arg types wrapped in tuples (not in sets)

* deprecated args are now ignored in argparser

* get_deprecated_arg_names small refactor

* get_deprecated_arg_names bug fixed

* Trainer cli related tests moved to test_trainer_cli.py

* refactored: get_init_arguments_and_types is a public classmethod of the Trainer now

* test_get_init_arguments_and_types added

* autopep8 fixes

* Trainer cli related tests moved to test_trainer_cli.py

* autopep8 fixes

* Trainer cli related tests moved to test_trainer_cli.py

* Trainer cli related tests moved to test_trainer_cli.py

* test_get_init_arguments_and_types added

* autopep8 fixes

* autopep8 fixes

* Apply suggestions from code review

* cosmetics

* cosmetics

* Update pytorch_lightning/trainer/trainer.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* `Trainer.get_init_arguments_and_types` now returns arg types wrapped in tuples (not in sets)

* deprecated args are now ignored in argparser

* get_deprecated_arg_names small refactor

* get_deprecated_arg_names bug fixed

* Update pytorch_lightning/trainer/trainer.py

Co-Authored-By: Joe Davison <joe@huggingface.co>

* Update pytorch_lightning/trainer/trainer.py

Co-Authored-By: Joe Davison <joe@huggingface.co>

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Joe Davison <joe@huggingface.co>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-03-24 14:55:27 -04:00
Jeremy Jordan 4c2026bf9a
increase profiler test coverage (#1208)
* increase profiler test coverage

* fix line length

* tests for valueerror assertions
2020-03-24 09:15:16 -04:00
Jirka Borovec 3be81cb54e
test deprecated - model (#1074)
* pylint

* model API

* update test

* formatting

* disable logger

* fix checking overwrite

* fix test

* typo

* deprecated model

* fix for DDP

* drop Flake8 in GH actions

* Update pytorch_lightning/trainer/evaluation_loop.py

* fix imports

Co-authored-by: Nic Eggert <nic@eggert.io>
2020-03-20 20:51:14 +01:00
Adrian Wälchli 732eaee4d7
nan detection and intervention (#1097)
* check for nan values

* test nan detection on loss

* sys.exit

* whitespace

* detect nan and inf values in loss and params

* update

* added documentation

* moved detect nan to training loop, remove flag for print

* blank line

* test

* rename

* deprecate print_nan_grads

* deprecated print_nan_grads

* remove unused imports

* update changelog

* fix line too long

* correct deprecated version

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* raise exception instead of sysexit

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* raise exception instead of sysexit

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/trainer/training_tricks.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/trainer/training_tricks.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* fix test

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-03-19 09:24:45 -04:00
So Uchida 01b8991c5a
Support hierarchical dict (#1152)
* Add support for hierarchical dict

* Support nested Namespace

* Add docstring

* Migrate hparam flattening to each logger

* Modify URLs in CHANGELOG

* typo

* Simplify the conditional branch about Namespace

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update CHANGELOG.md

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* added examples section to docstring

* renamed _dict -> input_dict

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-03-19 09:15:47 -04:00
Jirka Borovec 22a7264e9a
improve partial Codecov (#1172)
* ignore in setup

* show report

* abs imports

* abstract pass

* cover loggers

* doctest trains

* locals

* pass

* revert tensorboard

* use tensorboardX

* revert tensorboardX

* fix trains

* Add TrainsLogger.set_credentials (#1179)

* Add TrainsLogger.set_credentials to control trains server configuration and authentication from code. Sync trains package version.
Fix CI Trains tests

* Add global TrainsLogger set_bypass_mode (#1187)

* Add global TrainsLogger set_bypass_mode skips all external communication

Co-authored-by: bmartinn <>

* rm some no-cov

Co-authored-by: Martin.B <51887611+bmartinn@users.noreply.github.com>
2020-03-19 09:14:29 -04:00
Nicki Skafte 384e124490
ReduceLROnPlateau bug fix (#1126)
* bug fix and test

* update CHANGELOG.md

Co-authored-by: Nicki Skafte <nugginea@gmail.com>
2020-03-16 14:35:10 -04:00
Jakub 3ad6169f18
Neptune Logger Improvements (#1084)
* removed project and experiment from getstate

* added tests for closing experiment, updated token in example to user neptuner

* updated token

* Update neptune.py

added a link to example experiment

* added example experiment link

* dropped duplication

* flake fixes

* merged with master, added changes information to CHANGELOG
2020-03-14 13:02:40 -04:00
Martin.B c0bedd2587
Add TRAINS experiment manager support (#1122)
* Add allegro.ai TRAINS experiment manager support

* improve docstring and type hinting, fix the bug in log_metrics, add support torch.Tensor to input into log_image

* complete missing docstring of constructor's arguments

* fix docs

* pep8

* pep8

* remove redundant typing
use logging
fix typing and pep8

* remove deprecated interface

* add TrainsLogger test

* add TrainsLogger PR in CHANGELOG

* add id/name property documentation

* change logging as log

Co-authored-by: bmartinn <>
Co-authored-by: Sou Uchida <s.aiueo32@gmail.com>
2020-03-14 13:02:14 -04:00
monney da61398835
Add Support for Non-primitive types in TensorboardLogger (#1130)
* Added support for non-primitive types to tensorboard logger

* added EOF newline

* PEP8

* Updated CHANGELOG for PR #1130. Moved _sanitize_params to base logger. Cleaned up _sanitize_params

* Updated CHANGELOG for PR #1130. Moved _sanitize_params to base logger. Cleaned up _sanitize_params

* changed convert_params to static method

* PEP8

* Cleanup Doctest for _sanitize_params

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Removed OrderedDict import

* Updated import order to conventions

Co-authored-by: Manbir Gulati <manbirgulati@Manbirs-MBP.hsd1.md.comcast.net>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-03-14 13:02:05 -04:00
Jirka Borovec 1d5f06223a
fix tmpdir (#1012)
* fix tmpdir

* just str path
2020-03-12 12:46:25 -04:00
Ethan Harris 2b3f443f6b
Add support for IterableDatasets everywhere (#1104)
* Add support for IterableDatasets everywhere

* Added type hints, simplified code and improved coverage in data_loading.py

* Update CHANGELOG.md
2020-03-12 12:46:02 -04:00
Jirka Borovec 514d182b7f
cleaning imports (#1032) 2020-03-12 12:41:37 -04:00
Jirka Borovec 4896815067
remove deprecated `data_loader` (#1077)
* change version in CHANGELOG

* warning

* remove der data_loader

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-03-06 16:11:05 -05:00
William Falcon 3d18099262
removed decorators (#1079) 2020-03-06 16:09:47 -05:00
Jirka Borovec ff1f8ef400 Test deprecated API for 0.8.0 and 0.9.0 (#1071)
* till 0.8

* refactor

* fix tests

* fix tests

* deprx till 0.9

* Update trainer.py

* Apply suggestions from code review

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-03-06 21:36:44 +01:00
William Falcon 0ebfb78570
Examples: using new API (#1056)
* using new API

* typo
2020-03-05 19:31:57 -05:00
William Falcon 969e929a48
Learning rate stepping option (#941)
* remove deprecated args to learning rate step function

* step based scheduler

* mixing models for testing

* fix styling

* tests

* update documentation

* smaller fix

* update to dict structure

* updated test

* update documentation

* update CHANGELOG.md

* fix styling

* fix problems with trainer io

* fix tests

* simplification of code

* fix styling

* change from batch to step

* update to tests

* fix styling

* fixed some logic

* Update pytorch_lightning/core/lightning.py

* duplicated test

* fix test on amp

* small update to tests

* added monitor key for ReduceLROnPlateau

* Update trainer.py

* Update training_loop.py

* fix test after introducing monitor keyword

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-03-05 06:48:54 -05:00
William Falcon bcb45d906d
proper checkpoint implementation (#1043)
* enabled early stopping/checkpoint even without val step

* name formatting

* version

* testing

* add test

* fix test

* Update model_checkpoint.py

* doctests

* pylint

* tests

* debug

* debug

* enabled early stopping/checkpoint even without val step

* fix MNIST download (#1044)

* fix MNIST download

* simple

* name formatting

* version

* testing

* add test

* fix test

* doctests

* tests

* debug

* debug

* rebased 1041

* rebased 1041

* tests

* rebased 1041

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-03-04 23:02:19 -05:00
William Falcon 165b9fb3f3
fix MNIST download (#1044)
* fix MNIST download

* simple
2020-03-04 17:57:26 -05:00
Jirka Borovec e586ed4767
hparams as dict [blocked by 1041] (#1029)
* hparams as dict

* hparams as dict

* fixing

* fixing

* fixing

* fixing

* typing

* typing

* changelog

* update set hparams

* use setter

* simplify

* changelog

* imports

* pylint

* typing

* Update training_io.py

* Update training_io.py

* Update lightning.py

* Update test_trainer.py

* Update __init__.py

* Update base.py

* Update utils.py

* Update test_trainer.py

* Update training_io.py

* Update test_trainer.py

* Update test_trainer.py

* Update test_trainer.py

* Update test_trainer.py

* Update callback_config.py

* Update callback_config.py

* Update test_trainer.py

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-03-04 09:33:39 -05:00
Jirka Borovec 64de57b09e
update checkpoint docs (#1016)
* update checkpoint docs

* fix tests

* fix tests

* formatting

* typing

* filename

* fix tests

* fixing tests

* fixing tests

* fixing tests

* unique name

* fixing

* fixing

* Update model_checkpoint.py

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-03-03 15:16:57 -05:00
William Falcon 4c5e82c065
Skepticleo trainer argparser (#1023)
* Added default parser for trainer and class method to construct trainer from default args

* Removed print statement

* Added test for constructing Trainer from command line args

* Removed extra line

* Removed redundant imports, removed whitespace from empty lines

* Fixed typo

* Updated default parser creation to get class attributes automatically

* Updated default parser creation to get class attributes automatically

* Added method to get default args for trainer

* Trimmed trainer get default args method

* Updated from argparse method to not return trainer with static arguments

* Update trainer get default args to classmethod

* adjustment

* fix

* Fixed variable name

* Update trainer.py

* Update test_trainer.py

* Update trainer.py

* Update tests/trainer/test_trainer.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update trainer.py

* Update test_trainer.py

* Update trainer.py

* Update test_trainer.py

* Update tests/trainer/test_trainer.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/trainer/trainer.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update trainer.py

* Update test_trainer.py

Co-authored-by: Mudit Tanwani <mudittanwani@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-03-03 09:32:15 -05:00
Jeremy Jordan 705e576417
consolidate callbacks and hooks (#950)
* consolidate callbacks and hooks

* ensure callbacks receive proper arg types

* remove model from init callback events

* clean up early stopping event

* update changelog

* remove on_fit_start and on_fit_end

* fix args for on_init_start and on_init_end

* handle case where early stopping is not used

* show all callback methods

* wrap checkpoint callback logic into proper class

* fix check for main process in checkpoint callback

* move callbacks test to separate file

* refactor arg checks

* get model and call hook on same line

* define trainer_options dict in one call

* add more asserts to callback test
2020-03-02 23:51:32 -05:00
Adrian Wälchli 5458d05cd8
Merge load functions (#995)
* Update README.md

* Update README.md

* Use callable object for patching dataloaders (#971)

* Use callable object for patching dataloaders

* Add test for ddp with dataloaders passed to fit()

* Update pytorch_lightning/trainer/trainer.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/trainer/trainer.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* merge load functions

* update tests

* fix documentation warnings

* fix line too long

* fix line too long

* print deprecation warning

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* move tags_csv argument to end of signature

* fix typo, update version numbers

* fix line too long

* add typing as requested

* update changelog

Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Sho Arora <sho854@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-03-02 21:05:38 -05:00
Ethan Harris f862d9f691
Logger tests and fixes (#1009)
* Refactor logger tests

* Update and add tests for wandb logger

* Update and add tests for logger bases

* Update and add tests for mlflow logger

* Improve coverage

* Updates

* Update CHANGELOG

* Updates

* Fix style

* Fix style

* Updates
2020-03-02 20:49:14 -05:00
William Falcon 2a04be0386
No auto load weights (#985)
* remove autoload

* remove autoload

* added weights loading docs

* checkpoint loading saving docs

* checkpoint loading saving docs

* checkpoint loading saving docs

* docs (#1010)

* remove autoload

* remove autoload

* added weights loading docs

* checkpoint loading saving docs

* checkpoint loading saving docs

* checkpoint loading saving docs

* docs
2020-03-02 17:12:22 -05:00
Sho Arora d69455a466 Use callable object for patching dataloaders (#971)
* Use callable object for patching dataloaders

* Add test for ddp with dataloaders passed to fit()

* Update pytorch_lightning/trainer/trainer.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/trainer/trainer.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-03-02 17:11:54 +01:00
William Falcon ad80a7d638
clean docs (#967)
* clean docs

* clean docs

* clean docs
2020-02-27 17:21:51 -05:00
Jirka Borovec 7beed7cae6
Trainer cleanup (#934)
* Trainer cleanup

* update abstract

* remove ...

* remove __init__

* update mixin types

* update callbacks

* fix

* lower test acc
2020-02-27 16:21:14 -05:00
Hanbyul Kim 563e2ba2c6
resolving documentation warnings (#833)
* add more underline

* fix LightningModule import error

* remove unneeded blank line

* escape asterisk to fix inline emphasis warning

* add PULL_REQUEST_TEMPLATE.md

* add __init__.py and import imagenet_example

* fix duplicate label

* add noindex option to fix duplicate object warnings

* remove unexpected indent

* refer explicit LightningModule

* fix minor bug

* refer EarlyStopping explicitly

* restore exclude patterns

* change the way how to refer class

* remove unused import

* update badges & drop Travis/Appveyor (#826)

* drop Travis

* drop Appveyor

* update badges

* fix missing PyPI images & CI badges (#853)

* docs - anchor links (#848)

* docs - add links

* add desc.

* add Greeting action (#843)

* add Greeting action

* Update greetings.yml

Co-authored-by: William Falcon <waf2107@columbia.edu>

* add pep8speaks (#842)

* advanced profiler describe + cleaned up tests (#837)

* add py36 compatibility

* add test case to capture previous bug

* clean up tests

* clean up tests

* Update lightning_module_template.py

* Update lightning.py

* respond lint issues

* break long line

* break more lines

* checkout conflicting files from master

* shorten url

* checkout from upstream/master

* remove trailing whitespaces

* remove unused import LightningModule

* fix sphinx bot warnings

* Apply suggestions from code review

just to trigger CI

* Update .github/workflows/greetings.yml

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Jeremy Jordan <13970565+jeremyjordan@users.noreply.github.com>
2020-02-27 16:07:51 -05:00
Jirka Borovec d856989120
split trainer tests (#956)
* split trainer tests

* Apply suggestions from code review

* format string

* add CI timeout
2020-02-26 20:31:40 -05:00
William Falcon f86dd55145
fixes tpu data loader bug (#957)
* fixes tpu data loader bug

* fixes tpu data loader bug
2020-02-26 19:29:03 -05:00
Ethan Harris b2e9607362
Refactor dataloading (#955)
* Refactor dataloading

* Refactor dataloading

* Refactor dataloading

* Add shuffle to test
2020-02-26 16:55:18 -05:00
Hadrien Mary be244560b2
Callbacks [wip] (#889)
* Add callback system + associated test

* Add trainer and pl_module args to callback methods

* typing

* typo in docstring

* Switch to on_.*_start()

* fix on_test_start

* fix the mess after rebasing
2020-02-25 23:17:27 -05:00
Ir1dXD be83e7515b
feat(trainer): add enable_benchmarking option (#803)
* feat(trainer): add enable_benchmarking option

closes #370

* Update trainer.py

* add test

* try to make the lint work

* fix typo

* add test, verify torch.backends.cudnn.benchmark

* make lint happy

* make lint happy

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-02-25 15:05:41 -05:00
Ethan Harris a5f159b2c7
Add support for multiple loggers (#903)
* Add support for multiple loggers

* Fix PEP

* Cleanup

* Cleanup

* Add typing to loggers

* Update base.py

* Replace duck typing with isinstance check

* Update CHANGELOG.md

* Update comet experiment type, Switch to abstractmethod in logging.py

* Fix test

* Add passes to LightningLoggerBase

* Update experiment_logging.rst
2020-02-25 14:52:39 -05:00
Jirka Borovec 5dd2afeab1
Fixing tests (#936)
* abs import

* rename test model

* update trainer

* revert test_step check

* move tags

* fix test_step

* clean tests

* fix template

* update dataset path

* fix parent order
2020-02-25 13:06:24 -05:00
Adrian Wälchli 20d15c8023
relax hparams (#919)
relax model loading hparams


test wip


wip


fix warning


finish test


remove unused import
2020-02-25 10:36:44 -05:00
Chirag Raman 4d36e76cbc
Update tests README to point to tests/requirements.txt (#935)
* Update tests README

Point to tests/requirements.txt as part of instructions

* Update `requirements` to `dependencies`
2020-02-25 09:45:34 -05:00
William Falcon ceec51d96c
fix tests (#938)
* fix tests

* fix tests
2020-02-25 08:53:33 -05:00
Matt Painter 6b667b1237
Fix/test pass overrides (#918)
* Fix test requiring both test_step and test_end

* Add test

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-02-24 22:33:11 -05:00
William Falcon 1015a00506
Clean up dataloader logic (#926)
* added get dataloaders directly using a getter

* deleted decorator

* added prepare_data hook

* refactored dataloader init

* refactored dataloader init

* added dataloader reset flag and main loop

* added dataloader reset flag and main loop

* added dataloader reset flag and main loop

* made changes

* fixed bad loaders

* fixed error in .fit with loaders

* fixes #909

* fixes #909

* bug fix

* Fixes #902
2020-02-24 22:23:25 -05:00
Matt Painter 6e7dc9c236
Fixes resuming checkpoints rerunning last epoch (#866)
* Properly restore current epoch and global step on resume

* Add test

* Move increment to saving rather than loading

* Fix other tests that refer to current epoch

* Formatting

* Add warning for mid-epoch resuming

* Formatting

* Fix warning check for accumulated batches

* Add variable to init

* Formatting

* Add check for 0 training steps

* Make check more readable
2020-02-21 20:27:19 -05:00
Aljoscha Steffens 9eb1907151
separate requirements for logger dependencies (#792)
* added file that contains information on the minimal versions needed for the supported loggers

* copied minimal version, combined files, deleted duplicates

* sorted functions in tests/test_loggers.py to be consistent

* expanded wandb logging test; added minimal versions for requirements-extra.txt; increased the amount of training data that is used for tests

* formatting

* added requirements-extra.txt to MANIFEST.in

* reverted wandb test; ensured minimal version for dependencies in requirements-extra.txt in ci-testing.yml
2020-02-21 13:30:27 -05:00
Jeremy Jordan ea8878bc14
clean up tests/test_profiler.py (#867)
* cleanup docstrings, _get_total_cprofile_duration in module

* relax profiler overhead tolerance
2020-02-19 07:09:28 -05:00
Nicki Skafte ffd6e693de
new way of passing dataloaders (#759)
* new way of passing dataloaders

* fixed docs

* fixed codestyle to follow flake8

* allow val/test be list of dataloaders and smarter checking

* added test

* fix flake error

* fix linking to new test model

* split into multiple test

* fix naming and typo

* minor documentation changes

* remove random file

* Update trainer.py

* better error/warning message

* final adjustments

* update CHANGELOG.md

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-02-19 06:00:08 -05:00
Peter Izsak 054a35312d
Added max number of steps in Trainer (#728)
* Added max number of steps in Trainer

* Added docstring

* Fix flake8 errors

* Clarified docstrings

* Fixed flake8 error

* Added min_steps to Trainer

* Added steps and epochs test

* flake8

* minor fix

* fix steps test in test_trainer

* Split steps test into 2 tests

* Refactor steps test

* Update test_trainer.py

* Minor in test_trainer.py

* Update test_trainer.py

* Address PR comments

* Minor

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-02-18 11:23:22 -05:00
William Falcon d4a31f02e0
Enable TPU support (#868)
* added tpu docs

* added tpu flags

* add tpu docs + init training call

* amp

* amp

* amp

* amp

* optimizer step

* added auto data transfer to TPU

* fix test pkg create (#873)

* added auto data transfer to TPU

* added auto data transfer to TPU

* added auto data transfer to TPU

* added test return and print

* Update pytorch_lightning/trainer/trainer.py

Co-Authored-By: Luis Capelo <luiscape@gmail.com>

* Fix segmentation example (#876)

* removed torchvision model and added custom model

* minor fix

* Fixed relative imports issue

* Fix/typo (#880)

* Update greetings.yml

* Update greetings.yml

* Changelog (#869)

* Create CHANGELOG.md

* Update CHANGELOG.md

* Update CHANGELOG.md

* Update PULL_REQUEST_TEMPLATE.md

* Update PULL_REQUEST_TEMPLATE.md

* Add PR links to Version 0.6.0 in CHANGELOG.md

* Add PR links for Unreleased in CHANGELOG.md

* Update PULL_REQUEST_TEMPLATE.md

* Fixing Function Signatures (#871)

* added tpu docs

* added tpu flags

* add tpu docs + init training call

* amp

* amp

* amp

* amp

* optimizer step

* added auto data transfer to TPU

* added test return and print

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Luis Capelo <luiscape@gmail.com>
Co-authored-by: Akshay Kulkarni <akshayk.vnit@gmail.com>
Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>
Co-authored-by: Shikhar Chauhan <xssChauhan@users.noreply.github.com>
2020-02-17 16:01:20 -05:00
Vadim Bereznyuk edd4a87fb0
Refactor callbacks (#776)
* Refactor callbacks

* flake8

* Update docstrings

* Simplified callback, protected trainer

* .set_trainer() check

* update docs

* missed super().__init__()

* Updated tests

* Use uppercase

* refine checkpoint callback tests

* Added test_begin() and test_end()
2020-02-16 00:03:05 -05:00
Jeremy Jordan 4ae31cd1d5
advanced profiler describe + cleaned up tests (#837)
* add py36 compatibility

* add test case to capture previous bug

* clean up tests

* clean up tests
2020-02-15 23:43:43 -05:00
Dmitry Lipin 06ca6428b6
Allow user to specify 'step' key while logging metrics (#808)
* allow to specify 'step' key

* add test

* docs to log_metrics

* fix test

* rename

* also rename
2020-02-15 23:35:23 -05:00
Jirka Borovec 9f939447f2
add autopep8 to Contributions guide (#852)
* add autopep8 to Contrib.

* simplify cmd

* update GH templates

* add pytest-flake8

* update GH template
2020-02-15 20:24:38 -05:00
Jirka Borovec af44583050
drop torchvision, tests only (#797)
* drop torchvision, tests only

* manifest

* move test utils
2020-02-10 22:47:18 -05:00
Bob Kemp 8fa802e35b
Tensorboard path generalisation (#804)
* Allow experiment versions to be overridden by passing a string value.
Allow experiment names to be empty, in which case no per-experiment subdirectory will be created and checkpoints will be saved in the directory given by the save_dir parameter.

* Document tensorboard api changes

* Review comment fixes plus fixed test failure for minimum requirements build

* More format fixes from review
2020-02-10 09:07:17 -05:00
Jirka Borovec fc0ad03008 fix test for profiler (#800)
* fix test for profiler

* use allclose

* use relative tol
2020-02-09 17:48:37 -05:00
Jeremy Jordan 1cf430f7bc
new feature for profiling training runs (#782)
* initial implementation

* formatting, pass through profiler, docstring

* call profiler during training

* add initial tests

* report stats when training is done

* fix formatting

* error handling, bugfix in passthroughprofiler

* finish documenting profiler arg in Trainer

* relax required precision for profiling tests

* option to dump cProfiler results to text file

* use logging, format with black

* include profiler in docs

* improved logging and better docs

* appease the linter

* better summaries, wrapper for iterables

* fix typo

* allow profiler=True creation

* more documentation

* add tests for advanced profiler

* Update trainer.py

* make profilers accessible in pl.utilities

* reorg profiler files

* change import for profiler tests

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-02-06 22:01:21 -05:00
Adrian Wälchli 472f394788
Resolve some codefactor issues (#756)
* remove unnecessary pass statements

* use isinstance for type checks

* remove unnecessary else/elif after return

* remove unnecessary return statements

* move doc string to top

* merge isinstance calls

* remove unnecessary else/elif after raise

* use list comprehension

* do not use len without comparison

* add missing shebang

* revert isinstance check back to type

broke tests, because bool is actually subclass of int

* add missing period to doc string

* remove unnecessary pass statements

* use isinstance for type checks

* remove unnecessary else/elif after return

* remove unnecessary return statements

* move doc string to top

* merge isinstance calls

* remove unnecessary else/elif after raise

* use list comprehension

* do not use len without comparison

* add missing shebang

* revert isinstance check back to type

broke tests, because bool is actually subclass of int

* add missing period to doc string

* Fix default ckpt path when logger exists (#771)

* rename logging -> loggers (#767)

* move logging >> loggers

* add warning

* fix tests

* logging alias

* formatting

* formatting

* use isinstance for type checks

* revert isinstance check back to type

broke tests, because bool is actually subclass of int

* add more detail to tbptt example (#755)

* add more detail to tbptt example

* warn user about new arg in training_step

Co-authored-by: Vadim Bereznyuk <kuynzereb@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jeremy Jordan <13970565+jeremyjordan@users.noreply.github.com>
2020-02-01 18:44:05 -05:00
Jirka Borovec 76a1c67d87
rename logging -> loggers (#767)
* move logging >> loggers

* add warning

* fix tests

* logging alias

* formatting

* formatting
2020-02-01 15:47:58 -05:00
Vadim Bereznyuk 50881c0b31 Check early stopping metric in the beginning of the training (#542)
* Early stopping fix

* Update trainer.py

* Don't force validation sanity check

* fix tests

* update

* Added early_stopping check_metrics

* Updated docs

* Update docs

* Do not call early stopping when validation is disabled

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-01-23 11:12:51 -05:00
Nic Eggert dfb6d3626e Fix failing GPU tests (#722)
* Fix distributed_backend=None test

We now throw a warning instead of an exception. Update test
to reflect this.

* Fix test_tube logger close when debug=True
2020-01-21 14:26:43 -05:00
William Falcon 9e654c4ec8
Update requirements.txt 2020-01-21 08:11:22 -05:00
Jirka Borovec ea59a99426 update org paths & convert logos (#685)
* fix typos

* update org paths

* update links from README to docs

* add svg logo

* add svg logo-text

* update logos

* testing temp paths

* prune links from readme

* optimize imports

* update logo

* update paths in README

* missing imports
2020-01-20 14:50:31 -05:00
Z ZH de2ccc03a8 add version_ prefix to log_dir (#706)
* add version_ prefix to log_dir

* add version_ prefix
2020-01-18 07:17:53 -05:00
William Falcon bc67689068
clean v2 docs (#691)
* updated gitignore

* Update README.md

* updated gitignore

* updated links in ninja file

* updated docs

* Update README.md

* Update README.md

* finished callbacks

* finished callbacks

* finished callbacks

* fixed left menu

* added callbacks to menu

* added direct links to docs

* fixing TensorBoard (#687)

* flake8

* fix typo

* fix tensorboardlogger
drop test_tube dependence

* formatting

* fix tensorboard & tests

* upgrade Tensorboard

* test formatting separately

* try to fix JIT issue

* add tests for 1.4

* added direct links to docs

* updated gitignore

* updated links in ninja file

* updated docs

* finished callbacks

* finished callbacks

* finished callbacks

* fixed left menu

* added callbacks to menu

* added direct links to docs

* finished rebase

* making private  members

* making private  members

* making private  members

* working on trainer docs

* set auto dp if no backend

* working on trainer docs

* fixed lightning import

* cleared spaces

* finished lightning module

* finished lightning module

* finished lightning module

* finished lightning module

* added callbacks

* added loggers

* set auto dp if no backend

* added loggers

* flake 8

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-01-17 06:03:31 -05:00
Jirka Borovec bde549cb36 unify model test acc (#696) 2020-01-17 05:50:26 -05:00
Jirka Borovec f72e354ee6 fixing TensorBoard (#687)
* flake8

* fix typo

* fix tensorboardlogger
drop test_tube dependence

* formatting

* fix tensorboard & tests

* upgrade Tensorboard

* test formatting separately

* try to fix JIT issue

* add tests for 1.4
2020-01-16 07:22:29 -05:00
Boris Dayma ec7fc97857 Feature: wandb logger (#627)
* Basic wandb support

* refactor(wandb): remove unused variables and document logger

* docs(wandb): explain how to use WandbLogger

* test(wandb): add tests for WandbLogger

* feat(wandb): add save_dir

* fix(wandb): allow pickle of logger

* fix(wandb): save logs in custom directory

* test(wandb): test import

* docs(wandb): simplify docstring and use doctest

* test: increase number of epochs for satisfactory accuracy

* test(test_load_model_from_checkpoint): ensure we load last checkpoint

Co-authored-by: Chris Van Pelt <vanpelt@wandb.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-01-13 22:25:27 -05:00
Jirka Borovec f7db44e750 fix deprecated tng and abstract lightning (#644) 2020-01-13 22:20:38 -05:00
Jakub 8dc8a8bfd3 Neptune integration (#648)
* added neptune integration

* added tests for NeptuneLogger, added neptune to docs

* updated link to neptune support

* fixed docstrings, fixed try/except in tests, changed append_tags input

* fixed docstrings line length

* bumped epoch nr in model restore tests

* added tags support for single strings

* fixed passing neptune token to backend

* fixed project name in offline mode

* added save_top_k=-1 to checkpoint callback

* reformatted initialization of neptune in online mode

* bumped epoch nr to 4 in test_load_model_from_checkpoint

* bumped epoch nr to 5

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-01-13 22:20:01 -05:00
Jirka Borovec db6b404748 CI pass (#671)
* fix pillow in test

* test acc

* update version in deprecated msg
2020-01-13 22:09:47 -05:00
Vadim Bereznyuk 12edc3099c Fix the number of training batches used in the training loop (#653)
* Fix the number of processed training batches

* Fix tests

* fix tests

* One more attempt

* Fix another test
2020-01-05 14:37:09 -05:00
Nic Eggert 019f612204 Fix amp tests (#661)
* Run AMP tests in their own process

With opt_level="O1" (the default), AMP patches many
torch functions, which breaks any tests that run afterwards.
This patch introduces a pytest extension that lets
tests be marked with @pytest.mark.spawn so that they
are run in their own process using torch.multiprocessing.spawn
so that the main python interpreter stays un-patched.

Note that tests using DDP already run AMP in its own process,
so they don't need this annotation.

* Fix AMP tests

Since AMP defaults to O1 now, DP tests no longer throw exceptions.

Since AMP patches torch functions, CPU inference no longer works.
Skip prediction step for AMP tests.

* typo
2020-01-05 14:34:25 -05:00
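
The isolation idea described in the commit above (run a test that patches library functions in its own process so the main interpreter stays un-patched) can be illustrated without AMP or pytest. This is a minimal sketch only: it assumes a plain `subprocess` call as a stand-in for `torch.multiprocessing.spawn`, and `run_isolated` and `PATCHING_SNIPPET` are hypothetical names that do not come from the patch.

```python
import math
import subprocess
import sys


def run_isolated(snippet: str) -> str:
    """Run `snippet` in a fresh Python interpreter and return its stdout.

    Stand-in for spawning a separate test process: anything the snippet
    monkey-patches lives and dies with the child interpreter.
    """
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
        check=True,  # raise if the child exits with a non-zero status
    )
    return result.stdout.strip()


# Hypothetical "test body": it patches math.sqrt, loosely mimicking how a
# library might patch functions globally, then prints the patched result.
PATCHING_SNIPPET = "import math; math.sqrt = lambda x: -1; print(math.sqrt(4))"

child_output = run_isolated(PATCHING_SNIPPET)

# The parent interpreter never sees the patch.
parent_result = math.sqrt(4)
```

The child reports the patched behaviour while `math.sqrt` in the parent is unchanged, which is the property the `@pytest.mark.spawn` marker is described as providing for tests that run after an AMP test.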
Jirka Borovec 5d00e62047 Fix logger, tensorboard (#610)
* fix logger tests

* fix missing flush

* fix tensorboard

* fix namespace

* fix flush

* fix add_hparams
2019-12-08 07:59:25 -08:00
Nic Eggert 5329c72cb0 Implement TensorboardLogger (#607)
* Implement TensorboardLogger

* Pass default_save_path to trainers

* Update tensorboard.py
2019-12-07 23:25:37 -05:00
Jirka Borovec 4970624f8b fix Logger tests for Win (#605)
* fix mlflow test

* update logger / mlflow

* flake8

* fix appveyor
2019-12-07 19:25:12 -05:00
schwobr 2f01c03b38 Additional hooks (#598)
* Renamed `on_sanity_check_start` to `on_train_start` and added `on_train_end` to `ModelHooks`

* changed tests to use `on_train_start` instead of `on_sanity_check_start`
2019-12-07 08:52:06 -05:00
Elliot Waite 1051c189e1 Simplify variables: step, epoch, max_epochs, min_epochs (#589) 2019-12-07 08:50:21 -05:00
Adrian Wälchli f7e1040236 Docs and Tests for "gpus" Trainer Argument (#593)
* add table for gpus argument

* fix typo in error message

* tests for supported values

* tests for unsupported values

* fix typo

* fix typo list->str

* fix travis warning "line too long"
2019-12-07 08:48:45 -05:00
Nic Eggert 0489e31b02 Fix CometML tests (#585)
* monkeypatch atexit.register to fix problem with cometml logging

* Use experiment id for version in cometml
2019-12-07 00:24:59 -05:00
Jirka Borovec 1d4b6be17b rename trainer modules, drop `_mixin` (#571)
* rename trainer modules, drop _mixin

* fix imports
2019-12-04 11:39:14 -05:00
Jirka Borovec 3a58937d8b rename variables nb -> num (#567)
* rename nb -> num

* flake8

* batch_nb, epoch_nb, gpu_nb, split_nb

* add _num deprecations
2019-12-04 06:57:10 -05:00
Jirka Borovec 63717e8fda prune tests (#564)
* format docstring in tests

* prune unused vars

* optimize imports

* drop duplicated var
2019-12-04 06:48:53 -05:00
Nic Eggert 62f6f92fdf Use pytest tmpdir fixture (#482)
* Use pytest tmpdir

* Switch to tmpdir fixtures

* Switch to tmpdir fixture

* tmpdir fixture

* Fix more conflicts
2019-12-03 08:01:04 -05:00
Jirka Borovec 47659daa5f speed-up testing (#504)
* extend CI timeout

* add short MNIST

* lower dataset and stop thr

* refactor imports

* formatting

* early stop

* play params

* minor refactoring

# Conflicts:
#	pytorch_lightning/testing/__init__.py
#	pytorch_lightning/testing/lm_test_module.py
#	pytorch_lightning/testing/lm_test_module_base.py
#	pytorch_lightning/testing/lm_test_module_mixins.py
#	pytorch_lightning/testing/model.py
#	pytorch_lightning/testing/model_base.py
#	pytorch_lightning/testing/model_mixins.py
#	pytorch_lightning/testing/test_module.py
#	pytorch_lightning/testing/test_module_base.py
#	pytorch_lightning/testing/test_module_mixins.py

* typo

Co-Authored-By: Ir1dXD <sirius.caffrey@gmail.com>

* Revert "refactor imports"

This reverts commit b86aee92

* update imports
2019-11-28 12:06:05 -05:00
Jirka Borovec 9785a3e78e Refactor: name modules (#548)
* refactor: rename some modules

* add deprecation warnings

* fix paths
2019-11-26 22:39:18 -05:00
Ir1dXD 7324dd902b change Checkpoint callback's `save_best_only` to `save_top_k` (#128)
* docs: enable syntax highlight

* feat: change Checkpoint callback's `save_best_only` to `save_top_k`

fix #70

* docs: update docs for save_top_k

* revert other files

* style: lint for travis-ci

* fix typo

* make flake8 happy

* update according to review

* add tests

* rename func to private

* add doc on `save_top_k == 0`

* make flake8 happy

* update according to PR comments

* change some f-strings

* Update pt_callbacks.py

* Update test_models.py

* update options

* create folders

* Update test_models.py

* change epoch num

* support calling multiple times, add docs and tests

* update docs

* roll back changes in earlystopping

* clean test files

* make flake8 happy

* fix epoch number

* update tests about epoch numbers

* clean debugging code

* fix testing utils codes

* change save_dir to tests/tests according to previous lines

* remove unused overwrite option

* make flake8 happy

* change var name as per review

* make flake8 happy

* update property name to work on master

* elaborate in the docs

* update docs as per review

* revert previous commit

accidentally pressed wrong button when solving conflicts
2019-11-19 15:43:34 -08:00
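
The move from `save_best_only` to `save_top_k` in the commit above can be pictured as a selection over monitored scores. The following is a rough sketch, not the callback's actual code: `select_checkpoints` is a hypothetical helper, the monitored quantity is assumed to be a loss (lower is better), and the meanings assumed for the special values are `save_top_k == 0` keeps nothing and `save_top_k == -1` keeps every checkpoint.

```python
import heapq
from typing import Dict, List


def select_checkpoints(scores: Dict[int, float], save_top_k: int) -> List[int]:
    """Return the epochs whose checkpoints would be kept.

    `scores` maps epoch -> monitored value; lower is assumed to be better.
    """
    if save_top_k == 0:
        # Assumed meaning: checkpointing is effectively disabled.
        return []
    if save_top_k == -1:
        # Assumed meaning: keep every checkpoint.
        return sorted(scores)
    # Keep the k epochs with the smallest monitored value.
    best = heapq.nsmallest(save_top_k, scores.items(), key=lambda item: item[1])
    return sorted(epoch for epoch, _ in best)


# Hypothetical validation losses per epoch.
val_loss = {0: 0.90, 1: 0.55, 2: 0.60, 3: 0.40}

kept_top_2 = select_checkpoints(val_loss, 2)
kept_none = select_checkpoints(val_loss, 0)
kept_all = select_checkpoints(val_loss, -1)
```

With `save_top_k=1` this reduces to the old "best only" behaviour, which is why a single integer can replace the boolean flag.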
rwesterman d1b6b011c3 Comet fix (#481)
* Fixing comet ml bug and adding functionality

* Updating documents

* Fixing code style issues in comet_logger

* Changing comet_logger experiment to execute lazily

* Adding tests for comet_logger and addressing comments from @Borda

* Setting step_num to optional keyword argument in log_metrics() to comply to other loggers

* Adding offline logging mode for comet_ml, updating tests and docs

* Switching to MisconfigurationException
2019-11-11 23:00:31 -05:00
Jirka Borovec 1fd1e42aa6 Fix setup-doc for pypi (#472)
* add Twine to CI

* freeze Twine

* minor refactoring

* try another

* fix req.

* update README

* fix __doc__

* fix multiple req. test-tube
2019-11-09 00:59:14 -05:00
Nic Eggert 9fa2806605 Fix ModelCheckpoint default paths (#413)
* Make name and version properties required

* Warn before deleting files in checkpoint directory

* Get default checkpoint path from any logger

* Fix typos

* Uncomment logger tests

* Whitespace

* Update callback_config_mixin.py

Checkpoint and version file names would otherwise be just a number; with version_ prepended it's easy to tell what you're looking at.

* Address comments

* Fix broken tests
2019-11-05 10:41:59 -05:00
Yongrae Jo 32dd803b1e Fix min_max gpu memory logging bug (#453)
* #452 Fix ValueError

* #452 Use subprocess.run

* #452 Simplify code for gpu_memory_map

* #452 Simplify code for min max memory

* #452 Add test for get_memory_profile

* #452 Use os.sep

* #452 Use os.linesep
2019-11-05 08:55:44 -05:00
Ir1dXD 5a9afb11cc change print to logging (#457)
* change print to logging

* always use logging.info

* use f-strings

* update code style

* set logging configs

* remove unused code
2019-11-05 08:43:21 -05:00
William Falcon 37729f0a17 fixing test (#451) 2019-11-03 08:52:22 -05:00
Tullie Murrell 248495b1d1 Add tbptt (#429)
* Add truncated bptt

* Fix rebase error

* AutoPep8

* Address comments, incl default bptt_split impl

* Add tbptt test

* Add default split for lists/tuples

* Add tbptt docs

* Fix trainer spacing

* Update RequiredTrainerInterface.md
2019-10-31 06:45:28 -04:00
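
The "default split for lists/tuples" mentioned in the tbptt commit above amounts to cutting each sequence in a batch into consecutive chunks along the time dimension. Here is a simplified sketch using plain Python lists: `split_along_time` is a hypothetical helper, not the function added by the patch, and it assumes every sequence in the batch has the same length.

```python
from typing import List, Sequence


def split_along_time(batch: Sequence[Sequence[int]], split_size: int) -> List[List[Sequence[int]]]:
    """Split every sequence in `batch` into consecutive chunks of `split_size` steps.

    Returns one sub-batch per chunk; the last chunk may be shorter.
    """
    time_steps = len(batch[0])  # assumes all sequences share this length
    return [
        [seq[t:t + split_size] for seq in batch]
        for t in range(0, time_steps, split_size)
    ]


# Two sequences of five time steps each.
batch = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
splits = split_along_time(batch, 2)
```

In truncated backpropagation through time, each of these sub-batches would be processed in turn, with the hidden state carried across chunks but gradients cut at the chunk boundary.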