lightning

Commit Graph

Author	SHA1	Message	Date
William Falcon	890458fdbd	Fixes automatic parser bug (#1585 ) * fixes gpu parsing * fixes gpu parsing	2020-04-23 21:00:41 -04:00
Adrian Wälchli	3e8f2d99a9	Progress bar callback (#1450 ) * squash and rebase sanity check hooks sanity check callback hook finish moved core progress bar functionality into callback wip remove duplicate merge clean up imports docs sanity check progress bar main sanity move callback calls init progrss bar callback configuration and docs changelog rate decorator pass process_position disable on rank > 0 position index is_enabled remove decorator refactor init tqdm bars callback method ordering cannot reset when disabled sequence -> list default values fix has no attr _time() move on_val_end to proper place fix the pickle issue update warning properties check for None remove old comment switch order pull out non-tqdm functionality into base class documentation for the base class docs fix refresh rate issue in validation restrict type hint of trainer arg more docs update trainer docs rst docs fix lines too long fix test add missing type hints fix typo move docstring to __init__ solves doctest failures remove doctest :(( can't fix the pickle error fix example simplify by saving trainer reference fix docs errors move docstring initial value multiple val checks per epoch simpler handling of inf dataset sizes update inf docs renamed training_tqdm_dict rename get_tqdm_dict rename occurences of tqdm update changelog fix doctest fix formatting errors added callback tests progress bar on off test more tests for progress bar weird test fix? add ignored property disable default progress bar in LR finder change enable/disable behavior trying doctest in CI again undo doctest pickle error undo doctest pickle error :(( remove progress_bar_callback Trainer arg and fix tests restore progress bar after auto lr find update docs fix rebase fix wrong negation * fix fast dev run total * more thorough testing * remove old args * fix merge * fix merge * separate tests * type hint total batches * reduce if Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * is_disabled Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * is_enabled Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * rename enabled/disabled * move deprecated api * remove duplicated test from merge * fix rename is_disabled * newline * test also testprogress for fast dev run Co-authored-by: J. Borovec <jirka.borovec@seznam.cz> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-04-23 20:46:18 -04:00
Guy Davidson	fe2b6666e0	Fixing a small issue in trainer logging (#1563 ) * The epoch was being logged to metrics, which isn't read, rather than to current_metrics. * Updated the tests to account for the epoch arriving at the logger.	2020-04-23 17:52:41 -04:00
Jirka Borovec	7989ca844c	test deprecation warnings (#1470 ) * check deprecation warnings * extend warning test * try * unimport modules * update	2020-04-23 17:34:47 -04:00
Jirka Borovec	0b22b64a10	Tests/docker (#1573 ) * devel image * try parallel * new image	2020-04-23 12:52:59 -04:00
Nicki Skafte	e977d1cde5	Default value for ModelCheckpoint filepath (#1548 ) * allow determine of filepath at runtime * typing Co-authored-by: Nicki Skafte <nugginea@gmail.com>	2020-04-23 11:50:58 -04:00
Travis Addair	7024177f7d	Added Horovod distributed backend (#1529 ) * Initial commit of Horovod distributed backend implementation * Update distrib_data_parallel.py * Update distrib_data_parallel.py * Update tests/models/test_horovod.py Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * Update tests/models/test_horovod.py Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * Fixed tests * Added six * tests * Install tox for GitHub CI * Retry tests * Catch all exceptions * Skip cache * Remove tox * Restore pip cache * Remove the cache * Restore pip cache * Remove AMP Co-authored-by: William Falcon <waf2107@columbia.edu> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>	2020-04-22 17:39:08 -04:00
Jirka Borovec	c1c6e3b6c9	default test logger (#1478 ) * default test logger * fix tests * spawn * try * simplify tests * simplify tests * formatting * loggers * loggers * revert to TestTube * default * default * wraps * world size * optim imports	2020-04-21 20:33:10 -04:00
Jirka Borovec	bd168819f2	fix changelog (#1452 ) * fix changelog * formatting * add ddp_cpu * docs * add another	2020-04-20 17:36:26 -04:00
Adrian Wälchli	452fa858f4	skip warning test (#1533 )	2020-04-20 08:04:37 +00:00
William Falcon	ae2e14e3ed	fixed memory leak from opt return (#1528 ) * fixed memory leak from opt return * fixed memory leak from opt return * fixed memory leak from opt return * fixed memory leak from opt return * fixed memory leak from opt return * fixed memory leak from opt return * fixed memory leak from opt return * fixed memory leak from opt return * fixed memory leak from opt return * fixed memory leak from opt return * fixed memory leak from opt return * fixed memory leak from opt return * fixed memory leak from opt return * fixed memory leak from opt return * fixed memory leak from opt return * fixed memory leak from opt return * fixed memory leak from opt return * fixed memory leak from opt return * fixed memory leak from opt return	2020-04-19 16:41:54 -04:00
Adrian Wälchli	3c549e8ae3	Call on_before_zero_grad model hook (#1493 ) * call on_before_zero_grad * update changelog * add note about overriding both hooks * added test * move test_hooks.py to models folder	2020-04-16 12:01:41 -04:00
Nic Eggert	e3001a0929	Add ddp_cpu backend for testing ddp without GPUs (#1158 ) * Add tests for distributed backend config * Refactor set_distributed_mode * Use gloo backend on cpu * Use 127.0.0.1 instead of 127.0.0.2 Not totally clear on why this is necessary, but it seemt to work * Update LightningDDP so that it works with CPU * Add ddp_cpu backend and num_processes Trainer arg * PEP8 * Fix test skipping. Inequalities are hard :/ * Skip ddp_cpu test on Windows * Make a few more cases fall back to ddp_cpu * New function name * Flake8 * Don't test distributed on MacOS with torch < 1.3 Support for distributed in MacOS was added in Torch 1.3.0 * Add ddp_cpu and num_processes to docs * Parametrize trainer config tests * Tweak warning Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * Remove redundant test * Replace pass branches with comments * Add missing warnings import * save_path -> root_dir * Use new rank_zero_warn * Whitespace * Apply suggestions from code review * formatting Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>	2020-04-15 23:17:31 -04:00
William Falcon	3431c62d41	Remove error when test dataloader used in test (#1495 ) * remove error when test dataloader used in test * remove error when test dataloader used in test * remove error when test dataloader used in test * remove error when test dataloader used in test * remove error when test dataloader used in test * remove error when test dataloader used in test * fix lost model reference * remove error when test dataloader used in test * fix lost model reference * moved optimizer types * moved optimizer types * moved optimizer types * moved optimizer types * moved optimizer types * moved optimizer types * moved optimizer types * moved optimizer types * added tests for warning * fix lost model reference * fix lost model reference * added tests for warning * added tests for warning * refactoring * refactoring * fix imports * refactoring * fix imports * refactoring * fix tests * fix mnist * flake8 * review Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>	2020-04-15 22:16:40 -04:00
Jirka Borovec	8322f1b039	neptune online (#1499 )	2020-04-15 11:14:29 -04:00
Jirka Borovec	b3fe17ddeb	fix flushing loggers (#1459 ) * flushing loggers * flushing loggers * flushing loggers * flushing loggers * changelog * typo * fix trains * optimize imports * add logger test all * add logger test pickle * flake8 * fix benchmark * hanging loggers * try * del * all * cleaning	2020-04-14 20:32:33 -04:00
William Falcon	c96c6a6b33	attempting to remove some speed issues (#1482 ) * removed some .items * added speed tests * added speed tests * Update benchmarks/test_rnn_parity.py Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * Update benchmarks/test_trainer_parity.py Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * fix lost model reference * added speed tests Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-04-14 20:23:36 -04:00
Ethan Harris	8544b334e4	Replace automatic nan check with optional flag (#1475 ) * Replace automatic nan check with optional flag * Update CHANGELOG.md	2020-04-13 14:06:25 -04:00
Nicki Skafte	3f09b32df3	Learning Rate finder (#1347 ) * initial structure * rebase * incorporate suggestions * update CHANGELOG.md * initial docs * fixes based on reviews * added trainer arg * update docs * added saving/restore of model state * initial tests * fix styling * added more tests * fix docs, backward compatility and progressbar * fix styling * docs update * updates based on review * changed saving to standard functions * consistent naming * fix formatting * improve docs, added support for nested fields, improve codecov * update CHANGELOG.md * Update lr_finder.rst * Update pytorch_lightning/trainer/trainer.py * Update trainer.py * Update CHANGELOG.md * Update path * restoring * test * attribs * docs * doc typo Co-authored-by: Nicki Skafte <nugginea@gmail.com> Co-authored-by: William Falcon <waf2107@columbia.edu> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>	2020-04-10 14:34:23 -04:00
Jirka Borovec	d05ac813dc	fix deprecated default_save_path (#1449 )	2020-04-10 14:32:56 -04:00
William Falcon	b78c3d4da8	Fix weights path (#1445 ) * renamed default path to actual root_dir * added default weights path * added default weights path * added default weights path	2020-04-10 12:02:59 -04:00
Allard Hendriksen	7ac1580a31	Add automatic GPU choice to trainer (#1426 ) * Add automatic GPU choice to trainer This commit adds the `gpu_choice` parameter to Trainer. By default, this parameter is set to 'manual' which causes no observable difference in behavior. When `gpu_choice` is set to "auto" and `gpus` is an int, then the trainer will automatically allocate the first available GPU. This is especially useful when GPUs are configured to be in "exclusive mode", which means that only one process at a time can use them. * Rename gpu_choice -> auto_select_gpus	2020-04-10 11:45:29 -04:00
Rohit Gupta	e79ae18cae	Add test_dataloaders to test method (#1434 ) * Add test_dataloaders to test method * Remove test_dataloaders from .fit() * Fix code comment * Fix tests * Add test_dataloaders to test method (#1393) * Fix failing tests * Update docs (#1393)	2020-04-10 11:44:03 -04:00
Alexey Karnachev	4c34d16a34	Fixed configure optimizer from dict without "scheduler" key (#1443 ) * `configure_optimizer` from dict with only "optimizer" key. bug fixed * autopep8 * pep8speaks suggested fixes * CHANGELOG.md upd	2020-04-10 11:43:06 -04:00
Alex Sergeev	8dd9b80d7a	Fix gradient clipping (#1438 ) * Fix gradient clipping * Relax accuracy constraint	2020-04-09 21:08:28 -04:00
Jirka Borovec	17f58d2e11	add rank warning (#1428 ) * add rank warning * changelog * use rank_zero_warn * user trainer_init * replace warnings * fix test * flake8 * docs * changelog * bug lol	2020-04-09 14:05:46 -04:00
Alexey Karnachev	ddbf7de6dc	Added accumulation of loggers' metrics for the same steps (#1278 ) * `add_argparse_args` method fixed (argument types added) * autopep8 fixes * --gpus=0 removed from test (for ci tests) * Update pytorch_lightning/trainer/trainer.py Co-Authored-By: Joe Davison <joe@huggingface.co> * test_with_accumulate_grad_batches added * agg_and_log_metrics logic added to the base logger class * small format fix * agg metrics strategies removed (not to complicate stuff) * agg metrics: handle zero step * autopep8 * changelog upd * flake fix * metrics aggregators factored out, metrics_agg.py added + tests * metrics agg default value added * Update pytorch_lightning/loggers/metrics_agg.py Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * metrics aggregators factored out, metrics_agg.py added + tests * metrics agg default value added * Update pytorch_lightning/loggers/metrics_agg.py Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * remove .item which causes sync issues (#1254) * remove .item which causes sync issues * fixed gradient acc sched * fixed gradient acc sched * test_metrics_agg.py removed (all tested in doctrings), agg metrics refactored * test_metrics_agg.py removed (all tested in doctrings), agg metrics refactored * autopep8 * loggers base.py types fixed * test * test * metrics aggregation for loggers: each key now has a specific function (or default one) * metrics aggregation for loggers: each key now has a specific function (or default one) * docstrings upd * manual typehints removed from docstrings * batch_size decreased for test `test_with_accumulate_grad_batches` * extend running accum * refactor * fix tests * fix tests * allowed_types generator scoped * trainer.py distutils was imported twice, fixed * TensorRunningAccum refactored * TensorRunningAccum added to change log (Changed) * change log pull link added Co-authored-by: Joe Davison <joe@huggingface.co> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: William Falcon <waf2107@columbia.edu> Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>	2020-04-08 08:35:47 -04:00
Jeremy Jordan	91c9b29d47	add trainer attribute to denote if interrupted (#1368 ) * add trainer attribute to denote if interrupted * bugfix and formatting	2020-04-05 11:12:41 -04:00
Ethan Harris	b18accc64c	Add warning for few workers (#1378 ) * Add warning for few workers * Fix style issue * Update CHANGELOG.md * Update test * formatting * formatting Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-04-05 11:07:16 -04:00
Justus Schock	f6a86e8551	generalize reinstantiation of dataloader (#1346 ) * generalize reinstantiation of dataloader * fix condition * add test * update changelog * fix changelog Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>	2020-04-03 17:55:08 -04:00
William Falcon	3c5530c29d	Wandb bug/wandb multi (#1360 ) * Allow reinits in sub procs * Dont create an experiment on pickle, name, or project * Comments consistency * Fix test * Apply suggestions from code review Co-authored-by: Chris Van Pelt <vanpelt@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-04-03 15:03:00 -04:00
William Falcon	dd5a05926c	Borisdayma: fix(wandb) - fix watch method (#1361 ) * fix(wandb): fix watch method * rebased * Apply suggestions from code review Co-authored-by: Boris Dayma <boris.dayma@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-04-03 15:02:38 -04:00
Adrian Wälchli	ebd9fc9530	Fix for incorrect run on the validation set with overwritten validation_epoch_end and test_end (#1353 ) * reorder if clauses * fix wrong method overload in test * fix formatting * update change_log * fix line too long	2020-04-03 09:25:32 -04:00
Jean-Baptiste SCHIRATTI	868b172f05	Make training_epoch_end behave like validation_epoch_end (#1357 ) * Make training_epoch_end behave like validation_epoch_end + minor fixes in docstrings. * Minor fixes (Borda's comments). * Detach tensors in batch_output (to avoid possible memory leak) + doc fix. Co-authored-by: Jean-Baptiste SCHIRATTI <jean-baptisteschiratti@MacBook-Pro-de-Jean-Baptiste.local>	2020-04-03 14:43:26 +02:00
Gerard Bentley	f33b5a8d99	Simplify progress bar args (#1108 ) * show progress bar dependent on refresh_rate * test progress_bar_refresh control show bar * remove show_progress_bar from other tests * borda fixes * flake8 fix * changelog update prog bar refresh rate * move show_progress_bar to deprecated 0.9 api * rm show_progress_bar references, test deprecated * Update pytorch_lightning/trainer/__init__.py * fix test * changelog * minor CHANGELOG.md format * Update pytorch_lightning/trainer/__init__.py * Update pytorch_lightning/trainer/trainer.py Co-authored-by: Gerard Bentley <gbkh2015@mymail.pomona.edu> Co-authored-by: William Falcon <waf2107@columbia.edu> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>	2020-04-03 00:53:00 +02:00
Jirka Borovec	724b787cd1	faster CI testing (#1323 ) * MNIST digits * increase test acc * smaller parity * drone builds * increase GH action timeout * drone format * fix paths * drone cache * circle cache * fix test * lower nb epochs * circleCI * user orb * fix test * fix test * circle cache * circle cache * circle cache * comment caches * benchmark batch size * cache dataset * smaller dataset * smaller dataset * fix nb samples * batch size * fix test	2020-04-02 12:28:44 -04:00
Nicki Skafte	2912239fe6	Add useful errors when model is not configured correctly (#1199 ) * add check_model_configuration method * trying to fix errors * trying to fix tests * added test_epoch_end to lightning template * fix tests * fix new test after rebase * fix spelling * added more checks * updated formating * added tests * fixed CHANGELOG * Apply suggestions from code review * move test to new module * change check on configure_optimizers Co-authored-by: Nicki Skafte <nugginea@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-04-02 11:53:37 -04:00
Ethan Harris	28242f02d1	Remove default optimizer, add None optimizer option (#1279 ) * Add warning when using default optimizer * Refactor optimizer tests to test_optimizers * Remove default optimizer, add option to use no optimizer * Update CHANGELOG.md * Update pytorch_lightning/trainer/optimizers.py Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * Fix style Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-04-02 11:48:53 -04:00
Asaf Manor	aca8c7e6f3	Optimizer Frequencies logic, and new configure_optimizers (#1269 ) * init_optimizers accepts Dict, Sequence[Dict] and returns optimizer_frequencies. optimizer_frequencies was added as a member of Trainer. * Optimizer frequencies logic implemented in training_loop. Description added to configure_optimizers in LightningModule * optimizer frequencies tests added to test_gpu * Fixed formatting for merging PR #1269 * Apply suggestions from code review * Apply suggestions from code review Co-Authored-By: Asaf Manor <32155911+asafmanor@users.noreply.github.com> * Update trainer.py * Moving get_optimizers_iterable() outside. * Update note * Apply suggestions from code review * formatting * formatting * Update CHANGELOG.md * formatting * Update CHANGELOG.md Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-03-31 16:41:24 +00:00
Adrian Wälchli	d6646e151a	Move some tests to correct subfolder/file (#1312 ) * move some tests to trainer file * fix imports	2020-03-31 08:58:46 -04:00
Jirka Borovec	6ddb03922a	Profiler summary (#1259 ) * refactor and add types * add Prorfiler summary * fix imports * Revert "refactor and add types" This reverts commit b4c552fa * changelog * revert rename * fix test * mute verbose	2020-03-31 08:57:48 -04:00
Adrian Wälchli	1aba411da9	Early stopping when validation is disabled (#1235 ) * early stop fallback to train epoch * added test * fix imports * update docs * update changelog * fix typo	2020-03-31 06:24:26 +00:00
Bilal Khan	a707d4bea1	Replace Wandb callback's finalize with no-op (#1193 ) * Replace Wandb callback's finalize with no-op * Update pytorch_lightning/loggers/wandb.py * Update wandb.py * remove wandb logger's finalize and update tests * update changelog Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-03-30 18:45:06 -04:00
Nicki Skafte	2ccc7456ca	Error on zero length dataloaders (#1280 ) * error_on_zero_length * update CHANGELOG.md * added test * Update pytorch_lightning/trainer/data_loading.py Co-authored-by: Nicki Skafte <nugginea@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-03-30 18:43:53 -04:00
Jirka Borovec	09167efdb5	Checkpointing interval (#1272 ) * formatting * formatting * fix interval * fix train loop * fix test * parametrize test * Apply suggestions from code review Co-Authored-By: Adrian Wälchli <adrian.waelchli@students.unibe.ch> * fix calling * flake8 * add types Co-authored-by: Adrian Wälchli <adrian.waelchli@students.unibe.ch> Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-03-30 18:37:02 -04:00
Jirka Borovec	2ca5356429	clear skipping tests (#1285 ) * clear skipping tests * fix simple/multi GPU * review: simplify	2020-03-30 18:29:23 -04:00
Jirka Borovec	31017120fd	fix incomplete RunningMean (#1309 ) * fix RunningMean * changelog * fix none * Update supporters.py just needed to multiply by zero for init * Revert "Update supporters.py" This reverts commit `7e0da6c6` * fix NaN * formatting Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-03-30 18:28:31 -04:00
Adrian Wälchli	b7de42f70d	Add MNIST dataset & drop torchvision dep. from tests (#986 ) * added custom mnist without torchvision dep * move files so it does not conflict with mnist gitignore * mock torchvision for tests * fix line too long * fix line too long * fix "module level import not at top of file" warning * move mock imports to __init__.py * simplify MNIST a lot and download directly the .pt files * further simplify and clean up mnist * revert import overrides * make as before * drop PIL requirement * move mnist.py to datasets subfolder * use logging instead of print * choose same name as in torchvision * remove torchvision and pillow also from yml file * refactor if train Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * capitalized class attr * moved mnist to models * re-added datsets ignore * better name for file variable * Update mnist.py * move dataset classes to datasets.py * new line * update * update * fix automerge * move to base folder * adapt testingmnist to new mnist base class * remove temporal fix * fix datatype * remove old testingmnist * readable * fix import * fix whitespace * docstring Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * Update tests/base/datasets.py Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * changelog * added types * Update CHANGELOG.md Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * exist->isfile Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * index -> idx * temporary fix for trains error * better changelog message Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-03-30 18:25:37 -04:00
Jirka Borovec	c869dd8b8f	make evaluate private (#1260 ) * make evaluate private * changelog	2020-03-30 12:14:27 -04:00
Ethan Harris	ab09faa15e	Add support for iterable datasets when val_check_interval=1.0 (#1283 ) * Add support for iterable datasets when val_check_interval=1.0 * Update CHANGELOG.md	2020-03-29 15:27:44 -04:00
Jeremy Jordan	54507f417e	fix logging config and add profiler test (#1267 )	2020-03-29 14:56:36 -04:00
Jirka Borovec	61177cd1c8	system info (#1234 ) * system info * update big info * test script * update config * rename script * import path	2020-03-27 08:45:52 -04:00
Tyler Yep	6772e0c197	Remove unnecessary parameters to super() in documentation and source code (#1240 ) Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-03-27 12:36:50 +00:00
Jeremy Jordan	d394b80ac8	calling self.forward() -> self() (#1211 ) * self.forward() -> self() * update changelog Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-03-27 08:17:56 +01:00
Adrian Wälchli	2a4cd479e2	Disable validation when val_percent_check=0 (#1251 ) * fix disable validation * add test * update changelog * update docs for val_percent_check * make "fast training" docs consistent	2020-03-27 02:07:22 +00:00
Jirka Borovec	45d671a4a8	CI: split tests-examples (#990 ) * CI: split tests-examples * tests without template * comment depends * CircleCI typo * add doctest * update test req. * CI tests * setup macOS * longer train * lover pred acc * fix model * rename default model * lower tests acc * typo * imports * fix test optimizer * update calls * fix Win * lower Drone image * fix call * pytorch image * fix test * add dev image * add dev image * update image * drone volume * lint * update test notes * rename tests/models >> tests/base * group models * conftest * optim imports * typos * fix import * fix tests * install AMP * tests * fix import	2020-03-25 07:46:27 -04:00
Alexey Karnachev	ced662fc27	Custom argparser extension with Trainer arguments (argument types added) (#1147 ) * `add_argparse_args` method fixed (argument types added) * CHANGELOG.md upd * autopep8 fixes * --gpus=0 removed from test (for ci tests) * typo fixed * reduce on plateau scheduler fixed * Trainer cli related tests moved to test_trainer_cli.py * refactored: get_init_arguments_and_types is a public classmethod of the Trainer now * test_get_init_arguments_and_types added * autopep8 fixes * Trainer cli related tests moved to test_trainer_cli.py * refactored: get_init_arguments_and_types is a public classmethod of the Trainer now * test_get_init_arguments_and_types added * autopep8 fixes * Trainer cli related tests moved to test_trainer_cli.py * refactored: get_init_arguments_and_types is a public classmethod of the Trainer now * test_get_init_arguments_and_types added * autopep8 fixes * Trainer cli related tests moved to test_trainer_cli.py * test_get_init_arguments_and_types added * autopep8 fixes * Apply suggestions from code review * cosmetics * cosmetics * Update pytorch_lightning/trainer/trainer.py Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * `Trainer.get_init_arguments_and_types` now returns arg types wrapped in tuples (not in sets) * deprecated args are now ignored in argparser * get_deprecated_arg_names small refactor * get_deprecated_arg_names bug fixed * Trainer cli related tests moved to test_trainer_cli.py * refactored: get_init_arguments_and_types is a public classmethod of the Trainer now * test_get_init_arguments_and_types added * autopep8 fixes * Trainer cli related tests moved to test_trainer_cli.py * autopep8 fixes * Trainer cli related tests moved to test_trainer_cli.py * Trainer cli related tests moved to test_trainer_cli.py * test_get_init_arguments_and_types added * autopep8 fixes * autopep8 fixes * Apply suggestions from code review * cosmetics * cosmetics * Update pytorch_lightning/trainer/trainer.py Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * `Trainer.get_init_arguments_and_types` now returns arg types wrapped in tuples (not in sets) * deprecated args are now ignored in argparser * get_deprecated_arg_names small refactor * get_deprecated_arg_names bug fixed * Update pytorch_lightning/trainer/trainer.py Co-Authored-By: Joe Davison <joe@huggingface.co> * Update pytorch_lightning/trainer/trainer.py Co-Authored-By: Joe Davison <joe@huggingface.co> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Joe Davison <joe@huggingface.co> Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-03-24 14:55:27 -04:00
Jeremy Jordan	4c2026bf9a	increase profiler test coverage (#1208 ) * increase profiler test coverage * fix line length * tests for valueerror assertions	2020-03-24 09:15:16 -04:00
Jirka Borovec	3be81cb54e	test deprecated - model (#1074 ) * pylint * model API * update test * formatting * disable logger * fix checking overwrite * fix test * typo * deprecated model * fix for DDP * drop Flake8 in GH actions * Update pytorch_lightning/trainer/evaluation_loop.py * fix imports Co-authored-by: Nic Eggert <nic@eggert.io>	2020-03-20 20:51:14 +01:00
Adrian Wälchli	732eaee4d7	nan detection and intervention (#1097 ) * check for nan values * test nan detection on loss * sys.exit * whitespace * detect nan and inf values in loss and params * update * added documentation * moved detect nan to training loop, remove flag for print * blank line * test * rename * deprecate print_nan_grads * deprecated print_nan_grads * remove unused imports * update changelog * fix line too long * correct deprecated version Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * raise exception instead of sysexit Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * raise exception instead of sysexit Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * Update pytorch_lightning/trainer/training_tricks.py Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * Update pytorch_lightning/trainer/training_tricks.py Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * fix test Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-03-19 09:24:45 -04:00
So Uchida	01b8991c5a	Support hierarchical dict (#1152 ) * Add support for hierarchical dict * Support nested Namespace * Add docstring * Migrate hparam flattening to each logger * Modify URLs in CHANGELOG * typo * Simplify the conditional branch about Namespace Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * Update CHANGELOG.md Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * added examples section to docstring * renamed _dict -> input_dict Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-03-19 09:15:47 -04:00
Jirka Borovec	22a7264e9a	improve partial Codecov (#1172 ) * ignore in setup * show report * abs imports * abstract pass * cover loggers * doctest trains * locals * pass * revert tensorboard * use tensorboardX * revert tensorboardX * fix trains * Add TrainsLogger.set_credentials (#1179) * Add TrainsLogger.set_credentials to control trains server configuration and authentication from code. Sync trains package version. Fix CI Trains tests * Add global TrainsLogger set_bypass_mode (#1187) * Add global TrainsLogger set_bypass_mode skips all external communication Co-authored-by: bmartinn <> * rm some no-cov Co-authored-by: Martin.B <51887611+bmartinn@users.noreply.github.com>	2020-03-19 09:14:29 -04:00
Nicki Skafte	384e124490	ReduceLROnPlateau bug fix (#1126 ) * bug fix and test * update CHANGELOG.md Co-authored-by: Nicki Skafte <nugginea@gmail.com>	2020-03-16 14:35:10 -04:00
Jakub	3ad6169f18	Neptune Logger Improvements (#1084 ) * removed project and experiment from getstate * added tests for closing experiment, updated token in example to user neptuner * updated teoken * Update neptune.py added a link to example experiment * added exmaple experiment link * dropped duplication * flake fixes * merged with master, added changes information to CHANGELOG	2020-03-14 13:02:40 -04:00
Martin.B	c0bedd2587	Add TRAINS experiment manager support (#1122 ) * Add allegro.ai TRAINS experiment manager support * improve docstring and type hinting, fix the bug in log_metrics, add support torch.Tensor to input into log_image * complete missing docstring of constructor's arguments * fix docs * pep8 * pep8 * remove redundant typing use logging fix typing and pep8 * remove deprecated interface * add TrainsLogger test * add TrainsLogger PR in CHANGELOG * add id/name property documentation * change logging as log Co-authored-by: bmartinn <> Co-authored-by: Sou Uchida <s.aiueo32@gmail.com>	2020-03-14 13:02:14 -04:00
monney	da61398835	Add Support for Non-primitive types in TensorboardLogger (#1130 ) * Added support for non-primitive types to tensorboard logger * added EOF newline * PEP8 * Updated CHANGELOG for PR #1130. Moved _sanitize_params to base logger. Cleaned up _sanitize_params * Updated CHANGELOG for PR #1130. Moved _sanitize_params to base logger. Cleaned up _sanitize_params * changed convert_params to static method * PEP8 * Cleanup Doctest for _sanitize_params Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * Removed OrderedDict import * Updated import order to conventions Co-authored-by: Manbir Gulati <manbirgulati@Manbirs-MBP.hsd1.md.comcast.net> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-03-14 13:02:05 -04:00
Jirka Borovec	1d5f06223a	fix tmpdir (#1012 ) * fix tmpdir * just str path	2020-03-12 12:46:25 -04:00
Ethan Harris	2b3f443f6b	Add support for IterableDatasets everywhere (#1104 ) * Add support for IterableDatasets everywhere * Added type hints, simplified code and improved coverage in data_loading.py * Update CHANGELOG.md	2020-03-12 12:46:02 -04:00
Jirka Borovec	514d182b7f	cleaning imports (#1032 )	2020-03-12 12:41:37 -04:00
Jirka Borovec	4896815067	remove deprecated `data_loader` (#1077 ) * change version in CHangelog * warning * remove der data_loader Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-03-06 16:11:05 -05:00
William Falcon	3d18099262	removed decorators (#1079 )	2020-03-06 16:09:47 -05:00
Jirka Borovec	ff1f8ef400	Test deprecated API for 0.8.0 and 0.9.0 (#1071 ) * till 0.8 * refactor * fix tests * fix tests * deprx till 0.9 * Update trainer.py * Apply suggestions from code review Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-03-06 21:36:44 +01:00
William Falcon	0ebfb78570	Examples: using new API (#1056 ) * using new API * typo	2020-03-05 19:31:57 -05:00
William Falcon	969e929a48	Learning rate stepping option (#941 ) * remove deprecated args to learning rate step function * step based scheduler * mixing models for testing * fix styling * tests * update documentation * smaller fix * update to dict structure * updated test * update documentation * update CHANGELOG.md * fix styling * fix problems with trainer io * fix tests * simplification of code * fix styling * change from batch to step * update to tests * fix styling * fixed some logic * Update pytorch_lightning/core/lightning.py * duplicated test * fix test on amp * small update to tests * added monitor key for ReduceLROnPlateau * Update trainer.py * Update training_loop.py * fix test after introducing monitor keyword Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-03-05 06:48:54 -05:00
William Falcon	bcb45d906d	proper checkpoint implementation (#1043 ) * enabled early stopping/checkpooiunt even without val step * name formatting * version * testing * add test * fix test * Update model_checkpoint.py * doctests * pylint * tests * debug * enabled early stopping/checkpooiunt even without val step * fix MNIST download (#1044) * fix MNIST download * simple * name formatting * version * testing * add test * fix test * doctests * tests * debug * rebased 1041 * tests * rebased 1041 Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-03-04 23:02:19 -05:00
William Falcon	165b9fb3f3	fix MNIST download (#1044 ) * fix MNIST download * simple	2020-03-04 17:57:26 -05:00
Jirka Borovec	e586ed4767	hparams as dict [blocked by 1041] (#1029 ) * hparams as dict * hparams as dict * fixing * fixing * fixing * fixing * typing * typing * chnagelog * update set hparams * use setter * simplify * chnagelog * imports * pylint * typing * Update training_io.py * Update training_io.py * Update lightning.py * Update test_trainer.py * Update __init__.py * Update base.py * Update utils.py * Update test_trainer.py * Update training_io.py * Update test_trainer.py * Update test_trainer.py * Update test_trainer.py * Update test_trainer.py * Update callback_config.py * Update callback_config.py * Update test_trainer.py Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-03-04 09:33:39 -05:00
Jirka Borovec	64de57b09e	update checkpoint docs (#1016 ) * update checkpoint docs * fix tests * fix tests * formatting * typing * filename * fix tests * fixing tests * fixing tests * fixing tests * unique name * fixing * fixing * Update model_checkpoint.py Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-03-03 15:16:57 -05:00
William Falcon	4c5e82c065	Skepticleo trainer argparser (#1023 ) * Added default parser for trainer and class method to construct trainer from default args * Removed print statement * Added test for constructing Trainer from command line args * Removed extra line * Removed redundant imports, removed whitespace from empty lines * Fixed typo * Updated default parser creation to get class attributes automatically * Updated default parser creation to get class attributes automatically * Added method to get default args for trainer * Trimmed trainer get default args method * Updated from argparse method to not return trainer with static arguments * Update trainer get default args to classmethod * adjustment * fix * Fixed variable name * Update trainer.py * Update test_trainer.py * Update trainer.py * Update tests/trainer/test_trainer.py Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * Update trainer.py * Update test_trainer.py * Update trainer.py * Update test_trainer.py * Update tests/trainer/test_trainer.py Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * Update pytorch_lightning/trainer/trainer.py Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * Update trainer.py * Update test_trainer.py Co-authored-by: Mudit Tanwani <mudittanwani@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-03-03 09:32:15 -05:00
Jeremy Jordan	705e576417	consolidate callbacks and hooks (#950 ) * consolidate callbacks and hooks * ensure callbacks recieve proper arg types * remove model from init callback events * clean up early stopping event * update changelog * remove on_fit_start and on_fit_end * fix args for on_init_start and on_init_end * handle case where early stopping is not used * show all callback methods * wrap checkpoint callback logic into proper class * fix check for main process in checkpoint callback * move callbacks test to separate file * refactor arg checks * get model and call hook on same line * define trainer_options dict in one call * add more asserts to callback test	2020-03-02 23:51:32 -05:00
Adrian Wälchli	5458d05cd8	Merge load functions (#995 ) * Update README.md * Update README.md * Use callable object for patching dataloaders (#971) * Use callable object for patching dataloaders * Add test for ddp with dataloaders passed to fit() * Update pytorch_lightning/trainer/trainer.py Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * Update pytorch_lightning/trainer/trainer.py Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * merge load functions * update tests * fix documentation warnings * fix line too long * fix line too long * print deprecation warning Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * move tags_csv argument to end of signature * fix typo, update version numbers * fix line too long * add typing as requested * update changelog Co-authored-by: William Falcon <waf2107@columbia.edu> Co-authored-by: Sho Arora <sho854@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-03-02 21:05:38 -05:00
Ethan Harris	f862d9f691	Logger tests and fixes (#1009 ) * Refactor logger tests * Update and add tests for wandb logger * Update and add tests for logger bases * Update and add tests for mlflow logger * Improve coverage * Updates * Update CHANGELOG * Updates * Fix style * Fix style * Updates	2020-03-02 20:49:14 -05:00
William Falcon	2a04be0386	No auto load weights (#985 ) * remove autoload * added weights loading docs * checkpoint loading saving docs * docs (#1010) * remove autoload * added weights loading docs * checkpoint loading saving docs * docs	2020-03-02 17:12:22 -05:00
Sho Arora	d69455a466	Use callable object for patching dataloaders (#971 ) * Use callable object for patching dataloaders * Add test for ddp with dataloaders passed to fit() * Update pytorch_lightning/trainer/trainer.py Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * Update pytorch_lightning/trainer/trainer.py Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-03-02 17:11:54 +01:00
William Falcon	ad80a7d638	clean docs (#967 ) * clean docs * clean docs * clean docs	2020-02-27 17:21:51 -05:00
Jirka Borovec	7beed7cae6	Trainer cleanup (#934 ) * Trainer cleanup * update abstract * remove ... * remove __init__ * update mixin types * update callbacks * fix * lower test acc	2020-02-27 16:21:14 -05:00
Hanbyul Kim	563e2ba2c6	resolving documentation warnings (#833 ) * add more underline * fix LightningMudule import error * remove unneeded blank line * escape asterisk to fix inline emphasis warning * add PULL_REQUEST_TEMPLATE.md * add __init__.py and import imagenet_example * fix duplicate label * add noindex option to fix duplicate object warnings * remove unexpected indent * refer explicit LightningModule * fix minor bug * refer EarlyStopping explicitly * restore exclude patterns * change the way how to refer class * remove unused import * update badges & drop Travis/Appveyor (#826) * drop Travis * drop Appveyor * update badges * fix missing PyPI images & CI badges (#853) * docs - anchor links (#848) * docs - add links * add desc. * add Greeting action (#843) * add Greeting action * Update greetings.yml Co-authored-by: William Falcon <waf2107@columbia.edu> * add pep8speaks (#842) * advanced profiler describe + cleaned up tests (#837) * add py36 compatibility * add test case to capture previous bug * clean up tests * clean up tests * Update lightning_module_template.py * Update lightning.py * respond lint issues * break long line * break more lines * checkout conflicting files from master * shorten url * checkout from upstream/master * remove trailing whitespaces * remove unused import LightningModule * fix sphinx bot warnings * Apply suggestions from code review just to trigger CI * Update .github/workflows/greetings.yml Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: William Falcon <waf2107@columbia.edu> Co-authored-by: Jeremy Jordan <13970565+jeremyjordan@users.noreply.github.com>	2020-02-27 16:07:51 -05:00
Jirka Borovec	d856989120	split trainer tests (#956 ) * split trainer tests * Apply suggestions from code review * format string * add CI timeout	2020-02-26 20:31:40 -05:00
William Falcon	f86dd55145	fixes tpu data loader bug (#957 ) * fixes tpu data loader bug * fixes tpu data loader bug	2020-02-26 19:29:03 -05:00
Ethan Harris	b2e9607362	Refactor dataloading (#955 ) * Refactor dataloading * Refactor dataloading * Refactor dataloading * Add shuffle to test	2020-02-26 16:55:18 -05:00
Hadrien Mary	be244560b2	Callbacks [wip] (#889 ) * Add callback system + associated test * Add trainer and pl_module args to callback methods * typing * typo in docstring * Switch to on_._start() fix on_test_start * fix the mess after rebasing	2020-02-25 23:17:27 -05:00
Ir1dXD	be83e7515b	feat(trainer): add enable_benchmarking option (#803 ) * feat(trainer): add enable_benchmarking option closes #370 * Update trainer.py * Update trainer.py * Update trainer.py * Update trainer.py * Update trainer.py * Update trainer.py * Update trainer.py * Update trainer.py * add test * try to make the lint work * fix typo * add test, verify torch.backends.cudnn.benchmark * make lint happy * make lint happy Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-02-25 15:05:41 -05:00
Ethan Harris	a5f159b2c7	Add support for multiple loggers (#903 ) * Add support for multiple loggers * Fix PEP * Cleanup * Cleanup * Add typing to loggers * Update base.py * Replace duck typing with isinstance check * Update CHANGELOG.md * Update comet experiment type, Switch to abstractmethod in logging.py * Fix test * Add passes to LightningLoggerBase * Update experiment_logging.rst	2020-02-25 14:52:39 -05:00
Jirka Borovec	5dd2afeab1	Fixing tests (#936 ) * abs import * rename test model * update trainer * revert test_step check * move tags * fix test_step * clean tests * fix template * update dataset path * fix parent order	2020-02-25 13:06:24 -05:00
Adrian Wälchli	20d15c8023	relax hparams (#919 ) relax model loading hparams test wip wip fix warning finish test remove unused import	2020-02-25 10:36:44 -05:00
Chirag Raman	4d36e76cbc	Update tests README to point to tests/requirements.txt (#935 ) * Update tests README Point to tests/requirements.txt as part of instructions * Update `requirements` to `dependencies`	2020-02-25 09:45:34 -05:00
William Falcon	ceec51d96c	fix tests (#938 ) * fix tests * fix tests	2020-02-25 08:53:33 -05:00
Matt Painter	6b667b1237	Fix/test pass overrides (#918 ) * Fix test requiring both test_step and test_end * Add test Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-02-24 22:33:11 -05:00
William Falcon	1015a00506	Clean up dataloader logic (#926 ) * added get dataloaders directly using a getter * deleted decorator * added prepare_data hook * refactored dataloader init * added dataloader reset flag and main loop * made changes * fixed bad loaders * fixed error in .fit with loaders * fixes #909 * bug fix * Fixes #902	2020-02-24 22:23:25 -05:00
Matt Painter	6e7dc9c236	Fixes resuming checkpoints rerunning last epoch (#866 ) * Properly restore current epoch and global step on resume * Add test * Move increment to saving rather than loading * Fix other tests that refer to current epoch * Formatting * Add warning for mid-epoch resuming * Formatting * Fix warning check for accumulated batches * Add variable to init * Formatting * Add check for 0 training steps * Make check more readable	2020-02-21 20:27:19 -05:00
Aljoscha Steffens	9eb1907151	separate requirements for logger dependencies (#792 ) * added file that contains information on the minimal versions needed for the supported loggers * copied minimal version, combined files, deleted duplicates * sorted functions in tests/test_loggers.py to be consistent * expanded wandb logging test; added minimal versions for requirements-extra.txt; increased the amount of training data that is used for tests * formatting * added requirements-extra.txt to MANIFEST.in * reverted wandb test; ensured minimal version for dependencies in requirements-extra.txt in ci-testing.yml	2020-02-21 13:30:27 -05:00
Jeremy Jordan	ea8878bc14	clean up tests/test_profiler.py (#867 ) * cleanup docstrings, _get_total_cprofile_duration in module * relax profiler overhead tolerance	2020-02-19 07:09:28 -05:00
Nicki Skafte	ffd6e693de	new way of passing dataloaders (#759 ) * new way of passing dataloaders * fixed docs * fixed codestyle to follow flake8 * allow val/test be list of dataloaders and smarter checking * added test * fix flake error * fix linking to new test model * split into multiple test * fix naming and typo * minor documentation changes * remove random file * Update trainer.py * Update trainer.py * Update trainer.py * Update trainer.py * Update trainer.py * Update trainer.py * better error/warning message * final adjustments * update CHANGELOG.md Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-02-19 06:00:08 -05:00
Peter Izsak	054a35312d	Added max number of steps in Trainer (#728 ) * Added max number of steps in Trainer * Added docstring * Fix flake8 errors * Clarified docstrings * Fixed flake8 error * Added min_steps to Trainer * Added steps and epochs test * flake8 * minor fix * fix steps test in test_trainer * Split steps test into 2 tests * Refactor steps test * Update test_trainer.py * Minor in test_trainer.py * Update test_trainer.py * Address PR comments * Minor Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-02-18 11:23:22 -05:00
William Falcon	d4a31f02e0	Enable TPU support (#868 ) * added tpu docs * added tpu flags * add tpu docs + init training call * amp * optimizer step * added auto data transfer to TPU * fix test pkg create (#873) * added auto data transfer to TPU * added test return and print * Update pytorch_lightning/trainer/trainer.py Co-Authored-By: Luis Capelo <luiscape@gmail.com> * Fix segmentation example (#876) * removed torchvision model and added custom model * minor fix * Fixed relative imports issue * Fix/typo (#880) * Update greetings.yml * Changelog (#869) * Create CHANGELOG.md * Update CHANGELOG.md * Update PULL_REQUEST_TEMPLATE.md * Add PR links to Version 0.6.0 in CHANGELOG.md * Add PR links for Unreleased in CHANGELOG.md * Update PULL_REQUEST_TEMPLATE.md * Fixing Function Signatures (#871) * added tpu docs * added tpu flags * add tpu docs + init training call * amp * optimizer step * added auto data transfer to TPU * added test return and print Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Luis Capelo <luiscape@gmail.com> Co-authored-by: Akshay Kulkarni <akshayk.vnit@gmail.com> Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk> Co-authored-by: Shikhar Chauhan <xssChauhan@users.noreply.github.com>	2020-02-17 16:01:20 -05:00
Vadim Bereznyuk	edd4a87fb0	Refactor callbacks (#776 ) * Refactor callbacks * flake8 * Update docstrings * Simplified callback, protected trainer * .set_trainer() check * update docs * missed super().__ini__() * Updated tests * Use uppercase * refine checkpoint callback tests * Added test_begin() and test_end()	2020-02-16 00:03:05 -05:00
Jeremy Jordan	4ae31cd1d5	advanced profiler describe + cleaned up tests (#837 ) * add py36 compatibility * add test case to capture previous bug * clean up tests * clean up tests	2020-02-15 23:43:43 -05:00
Dmitry Lipin	06ca6428b6	Allow user to specify 'step' key while logging metrics (#808 ) * allow to specify 'step' key * add test * docs to log_metrics * fix test * rename * also rename	2020-02-15 23:35:23 -05:00
Jirka Borovec	9f939447f2	add autopep8 to Contributions guide (#852 ) * add autopep8 to Contrib. * simplify cmd * update GH templates * add pytest-flake8 * update GH template	2020-02-15 20:24:38 -05:00
Jirka Borovec	af44583050	drop torchvision, tests only (#797 ) * drop torchvision, tests only * manifest * move test utils	2020-02-10 22:47:18 -05:00
Bob Kemp	8fa802e35b	Tensorboard path generalisation (#804 ) * Allow experiment versions to be overridden by passing a string value. Allow experiment names to be empty, in which case no per-experiment subdirectory will be created and checkpoints will be saved in the directory given by the save_dir parameter. * Document tensorboard api changes * Review comment fixes plus fixed test failure for minimum requirements build * More format fixes from review	2020-02-10 09:07:17 -05:00
Jirka Borovec	fc0ad03008	fix test for profiler (#800 ) * fix test for profiler * use allclose * user relative tol	2020-02-09 17:48:37 -05:00
Jeremy Jordan	1cf430f7bc	new feature for profiling training runs (#782 ) * initial implementation * formatting, pass through profiler, docstring * call profiler during training * add initial tests * report stats when training is done * fix formatting * error handling, bugfix in passthroughprofiler * finish documenting profiler arg in Trainer * relax required precision for profiling tests * option to dump cProfiler results to text file * use logging, format with black * include profiler in docs * improved logging and better docs * appease the linter * better summaries, wrapper for iterables * fix typo * allow profiler=True creation * more documentation * add tests for advanced profiler * Update trainer.py * make profilers accessible in pl.utilities * reorg profiler files * change import for profiler tests Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-02-06 22:01:21 -05:00
Adrian Wälchli	472f394788	Resolve some codefactor issues (#756 ) * remove unnecessary pass statements * use isinstance for type checks * remove unnecessary else/elif after return * remove unnecessary return statements * move doc string to top * merge isinstance calls * remove unnecessary else/elif after raise * use list comprehension * do not use len without comparison * add missing shebang * revert isinstance check back to type broke tests, because bool is actually subclass of int * add missing period to doc string * remove unnecessary pass statements * use isinstance for type checks * remove unnecessary else/elif after return * remove unnecessary return statements * move doc string to top * merge isinstance calls * remove unnecessary else/elif after raise * use list comprehension * do not use len without comparison * add missing shebang * revert isinstance check back to type broke tests, because bool is actually subclass of int * add missing period to doc string * Fix default ckpt path when logger exists (#771) * rename logging -> loggers (#767) * move logging >> loggers * add warning * fix tests * logging alias * formatting * formatting * use isinstance for type checks * revert isinstance check back to type broke tests, because bool is actually subclass of int * add more detail to tbptt example (#755) * add more detail to tbptt example * warn user about new arg in training_step Co-authored-by: Vadim Bereznyuk <kuynzereb@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Jeremy Jordan <13970565+jeremyjordan@users.noreply.github.com>	2020-02-01 18:44:05 -05:00
Jirka Borovec	76a1c67d87	rename logging -> loggers (#767 ) * move logging >> loggers * add warning * fix tests * logging alias * formatting * formatting	2020-02-01 15:47:58 -05:00
Vadim Bereznyuk	50881c0b31	Check early stopping metric in the beginning of the training (#542 ) * Early stopping fix * Update trainer.py * Don't force validation sanity check * fix tests * update * Added early_stopping check_metrics * Updated docs * Update docs * Do not call early stopping when validation is disabled Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-01-23 11:12:51 -05:00
Nic Eggert	dfb6d3626e	Fix failing GPU tests (#722 ) * Fix distributed_backend=None test We now throw a warning instead of an exception. Update test to reflect this. * Fix test_tube logger close when debug=True	2020-01-21 14:26:43 -05:00
William Falcon	9e654c4ec8	Update requirements.txt	2020-01-21 08:11:22 -05:00
Jirka Borovec	ea59a99426	update org paths & convert logos (#685 ) * fix typos * update org paths * update links from READMe to docs * add svg logo * add svg logo-text * update logos * testing temp paths * prune links from readme * optimize imports * update logo * update paths in README * missing imports	2020-01-20 14:50:31 -05:00
Z ZH	de2ccc03a8	add version_ prefix to log_dir (#706 ) * add version_ prefix to log_dir * add version_ prefix	2020-01-18 07:17:53 -05:00
William Falcon	bc67689068	clean v2 docs (#691 ) * updated gitignore * Update README.md * updated gitignore * updated links in ninja file * updated docs * Update README.md * finished callbacks * fixed left menu * added callbacks to menu * added direct links to docs * fixing TensorBoard (#687) * flake8 * fix typo * fix tensorboardlogger drop test_tube dependence * formatting * fix tensorboard & tests * upgrade Tensorboard * test formatting separately * try to fix JIT issue * add tests for 1.4 * added direct links to docs * updated gitignore * updated links in ninja file * updated docs * finished callbacks * fixed left menu * added callbacks to menu * added direct links to docs * finished rebase * making private members * working on trainer docs * set auto dp if no backend * working on trainer docs * fixed lightning import * cleared spaces * finished lightning module * added callbacks * added loggers * set auto dp if no backend * added loggers * flake 8 Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-01-17 06:03:31 -05:00
Jirka Borovec	bde549cb36	unify model test acc (#696 )	2020-01-17 05:50:26 -05:00
Jirka Borovec	f72e354ee6	fixing TensorBoard (#687 ) * flake8 * fix typo * fix tensorboardlogger drop test_tube dependence * formatting * fix tensorboard & tests * upgrade Tensorboard * test formatting separately * try to fix JIT issue * add tests for 1.4	2020-01-16 07:22:29 -05:00
Boris Dayma	ec7fc97857	Feature: wandb logger (#627 ) * Basic wandb support * refactor(wandb): remove unused variables and document logger * docs(wandb): explain how to use WandbLogger * test(wandb): add tests for WandbLogger * feat(wandb): add save_dir * fix(wandb): allow pickle of logger * fix(wandb): save logs in custom directory * test(wandb): test import * docs(wandb): simplify docstring and use doctest * test: increase number of epochs for satisfactory accuracy * test(test_load_model_from_checkpoint): ensure we load last checkpoint Co-authored-by: Chris Van Pelt <vanpelt@wandb.com> Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-01-13 22:25:27 -05:00
Jirka Borovec	f7db44e750	fix deprecated tng and abstract ligntning (#644 )	2020-01-13 22:20:38 -05:00
Jakub	8dc8a8bfd3	Neptune integration (#648 ) * added neptune integration * added tests for NeptuneLogger, added neptune to docs * updated link to neptune support * fixed docstrings, fixed try/except in tests, changed append_tags input * fixed docstrings line lenght * bumped epoch nr in model restore tests * added tags support for single strings * fixed passing neptune token to backend * fixed project name in offline mode * added save_top_k=-1 to checkpoint callback * reformated initialization of neptune in online mode * bumped epoch nr to 4 in test_load_model_from_checkpoint * bumped epoch nr to 5 Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-01-13 22:20:01 -05:00
Jirka Borovec	db6b404748	CI pass (#671 ) * fix pillow in test * test acc * update version in deprecated msg	2020-01-13 22:09:47 -05:00
Vadim Bereznyuk	12edc3099c	Fix the number of training batches used in the training loop (#653 ) * Fix the number of processed training batches * Fix tests * fix tests * fix tests * One more attempt * Fix another test	2020-01-05 14:37:09 -05:00
Nic Eggert	019f612204	Fix amp tests (#661 ) * Run AMP tests in their own process With opt_level="O1" (the default), AMP patches many torch functions, which breaks any tests that run afterwards. This patch introduces a pytest extension that lets tests be marked with @pytest.mark.spawn so that they are run in their own process using torch.multiprocessing.spawn so that the main python interpreter stays un-patched. Note that tests using DDP already run AMP in its own process, so they don't need this annotation. * Fix AMP tests Since AMP defaults to O1 now, DP tests no longer throw exceptions. Since AMP patches torch functions, CPU inference no longer works. Skip prediction step for AMP tests. * typo	2020-01-05 14:34:25 -05:00
Jirka Borovec	5d00e62047	Fix logger, tensorboard (#610 ) * fix logger tests * fix missing flush * fix tensorboard * fix namespace * fix flush * fix add_hparams	2019-12-08 07:59:25 -08:00
Nic Eggert	5329c72cb0	Implement TensorboardLogger (#607 ) * Implement TensorboardLogger * Pass default_save_path to trainers * Update tensorboard.py	2019-12-07 23:25:37 -05:00
Jirka Borovec	4970624f8b	fix Logger tests for Win (#605 ) * fix mlflow test * fix mlflow test * update logger / mlflow * flake8 * fix appveyor	2019-12-07 19:25:12 -05:00
schwobr	2f01c03b38	Additional hooks (#598 ) * Renamed `on_sanity_check_start` to `on_train_start` and added `on_train_end` to `ModelHooks` * changed tests to use `on_train_start` instead of `on_sanity_check_start`	2019-12-07 08:52:06 -05:00
Elliot Waite	1051c189e1	Simplify variables: step, epoch, max_epochs, min_epochs (#589 )	2019-12-07 08:50:21 -05:00
Adrian Wälchli	f7e1040236	Docs and Tests for "gpus" Trainer Argument (#593 ) * add table for gpus argument * fix typo in error message * tests for supported values * tests for unsupported values * fix typo * add table for gpus argument * fix typo in error message * tests for supported values * tests for unsupported values * fix typo * fix typo list->str * fix travis warning "line too long"	2019-12-07 08:48:45 -05:00
Nic Eggert	0489e31b02	Fix CometML tests (#585 ) * monkeypatch atexit.register to fix problem with cometml logging * Use experiment id for version in cometml	2019-12-07 00:24:59 -05:00
Jirka Borovec	1d4b6be17b	rename trainer modules, drop `_mixin` (#571 ) * rename trainer modules, drop _mixin * fix imports	2019-12-04 11:39:14 -05:00
Jirka Borovec	3a58937d8b	rename variables nb -> num (#567 ) * rename nb -> num * flake8 * batch_nb, epoch_nb, gpu_nb, split_nb * add _num deprecations	2019-12-04 06:57:10 -05:00
Jirka Borovec	63717e8fda	prune tests (#564 ) * format docstring in tests * prune unused vars * optimize imports * drop duplicated var	2019-12-04 06:48:53 -05:00
Nic Eggert	62f6f92fdf	Use pytest tmpdir fixture (#482 ) * Use pytest tmpdir * Switch to tmpdir fixtures * Switch to tmpdir fixture * tmpdir fixture * Fix more conflicts	2019-12-03 08:01:04 -05:00
Jirka Borovec	47659daa5f	speed-up testing (#504 ) * extend CI timeout * add short MNIST * lower dataset and stop thr * refactor imports * formatting * early stop * play params * play params * minor refactoring # Conflicts: # pytorch_lightning/testing/__init__.py # pytorch_lightning/testing/lm_test_module.py # pytorch_lightning/testing/lm_test_module_base.py # pytorch_lightning/testing/lm_test_module_mixins.py # pytorch_lightning/testing/model.py # pytorch_lightning/testing/model_base.py # pytorch_lightning/testing/model_mixins.py # pytorch_lightning/testing/test_module.py # pytorch_lightning/testing/test_module_base.py # pytorch_lightning/testing/test_module_mixins.py * typo Co-Authored-By: Ir1dXD <sirius.caffrey@gmail.com> * Revert "refactor imports" This reverts commit `b86aee92` * update imports	2019-11-28 12:06:05 -05:00
Jirka Borovec	9785a3e78e	Refactor: name modules (#548 ) * refactor: rename some modules * add deprecation warnings * fix paths	2019-11-26 22:39:18 -05:00
Ir1dXD	7324dd902b	change Checkpoint callback's `save_best_only` to `save_top_k` (#128 ) * docs: enable syntax highlight * feat: change Checkpoint callback's `save_best_only` to `save_top_k` fix #70 * docs: update docs for save_top_k * revert other files * style: lint for travis-ci * fix typo * make flake8 happy * update according to review * add tests * rename func to private * add doc on `save_top_k == 0` * make flake8 happy * update according to PR comments * change some f-strings * Update pt_callbacks.py * Update test_models.py * update options * create folders * Update test_models.py * change epoch num * support calling multiple times, add docs and tests * update docs * roll back changes in earlystopping * clean test files * make flake8 happy * fix epoch number * update tests about epoch numbers * clean debugging code * fix testing utils codes * fix testing utils codes * fix testing utils codes * fix testing utils codes * change save_dir to tests/tests according to previous lines * remove unused overwrite option * make flake8 happy * change var name as per review * make flake8 happy * update property name to work on master * elaborate in the docs * update docs as per review * revert previous commit accidentally pressed wrong button when solving conflicts	2019-11-19 15:43:34 -08:00
rwesterman	d1b6b011c3	Comet fix (#481 ) * Fixing comet ml bug and adding functionality * Updating documents * Fixing code style issues in comet_logger * Changing comet_logger experiment to execute lazily * Adding tests for comet_logger and addressing comments from @Borda * Setting step_num to optional keyword argument in log_metrics() to comply to other loggers * Adding offline logging mode for comet_ml, updating tests and docs * Switching to MisconfigurationException	2019-11-11 23:00:31 -05:00
Jirka Borovec	1fd1e42aa6	Fix setup-doc for pypi (#472 ) * add Twine to CI * freeze Twine * freeze Twine * minor refactoring * try another * fix req. * update README * fix __doc__ * fix multiple req. test-tube	2019-11-09 00:59:14 -05:00
Nic Eggert	9fa2806605	Fix ModelCheckpoint default paths (#413 ) * Make name and version properties required * Warn before deleting files in checkpoint directory * Get default checkpoint path from any logger * Fix typos * Uncomment logger tests * Whitespace * Update callback_config_mixin.py checkpoints and version file names would just have a number. it's easy to tell what you're looking at with version_ prepended * Address comments * Fix broken tests	2019-11-05 10:41:59 -05:00
Yongrae Jo	32dd803b1e	Fix min_max gpu memory logging bug (#453 ) * #452 Fix ValueError * #452 Use subprocess.run * #452 Simplify code for gpu_memory_map * #452 Simplify code for min max memory * #452 Add test for get_memory_profile * #452 Use os.sep * #452 Use os.linesep	2019-11-05 08:55:44 -05:00
Ir1dXD	5a9afb11cc	change print to logging (#457 ) * change print to logging * always use logging.info * use f-strings * update code style * set logging configs * remove unused code	2019-11-05 08:43:21 -05:00
William Falcon	37729f0a17	fixing test (#451 )	2019-11-03 08:52:22 -05:00
Tullie Murrell	248495b1d1	Add tbptt (#429 ) * Add truncated bptt * Fix rebase error * AutoPep8 * Address comments, incl default bptt_split impl * Add tbptt test * Add default split for lists/tuples * Add tbptt docs * Fix trainer spacing * Update RequiredTrainerInterface.md	2019-10-31 06:45:28 -04:00

505 Commits