lightning

Commit Graph

Author	SHA1	Message	Date
Jirka Borovec	b7d72706c3	clean imports (#2867 ) * clean imports * miss	2020-08-08 00:33:51 +02:00
William Falcon	f82d7feb6c	updated hooks (#2850 ) * modified hooks	2020-08-07 09:29:57 -04:00
Jirka Borovec	0fe933e23d	fixing TPU tests (#2632 ) * init * rename * tpu_core_idx * idx 8 * idxs * @pl_multi_process_test * assert * assert * deamon * no close * imort * msg * use_single_gpu * dataset * idx * fix idx * dataset * format * add pickable * typo * apex * typo * wip * wip * wip * wip * wip * wip * wip * wip * docs * typo * tests * tests * tests * tests * tests * tests * tests * tests * tests * tests * tests * tests * tests * tests * tests * tests * tests * docs * docs * Apply suggestions from code review Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * Apply suggestions from code review Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk> * docs * Apply suggestions from code review Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2020-07-27 19:07:09 -04:00
Rohit Gupta	84c507c4df	Fix max_batches with fast_dev_run. (#2581 ) * Fix fast_dev_run to run for all val_dataloaders * fast_dev_run check * changelog * explicit * limit_batches with fast_dev_run in init * add test * whitespace and comment fix * comment and assertion * added tests * Fix fast_dev_run to run for all val_dataloaders * fast_dev_run check * changelog * explicit * limit_batches with fast_dev_run in init * add test * whitespace and comment fix * comment and assertion * added tests * added tests * added tests * added tests * update rtol * Revert "update rtol" This reverts commit `4320329540`. * added tests Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-07-27 17:56:55 -04:00
Nathan Raw	9076551aec	Enable val/test loop disabling + datamodule tests (#2692 ) * 🎨 warn instead of error out on loaders * 🐛 test misconfiguration should still fail * 🚧 . * updated docs with new result obj Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-07-25 12:57:40 -04:00
Adrian Wälchli	1e68968ed7	support num_sanity_val_steps=-1 (#2246 ) * support sanity_val_step=-1 * fix list size * simplification * simplify * add test for num_sanity_val_steps=-1 * update test * update docs * extend tests to multiple dataloaders * changelog * Update tests/trainer/test_trainer.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * improve test * refactor the sanity check decision * fix merge * Update trainer.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-07-23 07:07:03 -04:00
William Falcon	62ce00f96c	EvalResult support for val loop (PR 3/5) (#2651 ) * add EvalResult to support to val/test loops	2020-07-22 13:53:10 -04:00
William Falcon	aaa1553e35	tests for val loop flow (#2605 ) * add tests for single scalar return from training * fixing val step only	2020-07-14 14:20:45 -04:00
William Falcon	e068af9ea8	Ampt (#2572 ) * remove grad scaling tpu	2020-07-09 21:28:11 -04:00
William Falcon	11069c8784	Fix ddp tests + .test() (#2512 ) * added base tests for tpu * fix deprecation warnings * added base tests for tpu * Update pytorch_lightning/trainer/trainer.py Co-authored-by: Jeremy Jordan <13970565+jeremyjordan@users.noreply.github.com> * added base tests for tpu Co-authored-by: Jirka <jirka@pytorchlightning.ai> Co-authored-by: Jeremy Jordan <13970565+jeremyjordan@users.noreply.github.com>	2020-07-07 12:24:56 -04:00
William Falcon	afdfba1dc6	removed auto val reduce (#2462 )	2020-07-02 07:04:18 -04:00
William Falcon	325852c6df	enabled no returns from eval (#2446 ) * enabled no returns from eval * fixed docs	2020-07-01 07:38:00 -04:00
William Falcon	309ed75c5d	added reduce ddp results on eval (#2434 ) * added reduce ddp results on eval	2020-06-30 16:15:35 -04:00
Jirka Borovec	0be78d13aa	native amp (#2373 ) * native amp * typo * imports * apex	2020-06-26 21:45:13 -04:00
Jirka Borovec	f1c96930b1	repair CI for Win (#2358 ) * no cov * no cov * ReduceOp * group * reduce_op.sum * Update sklearns.py * formatting * horovod * Apply suggestions from code review * horovod * horovod * horovod * horovod * ci * print * ci * timeout * timeout * time * fix * distributed cpu * pipes * time * cpu * spawn * spawn * spawn * tp * separate * os * os * npm * Fix load_from_checkpoint() not working with URL on Windows * Update CHANGELOG * Update CHANGELOG.md Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com> * Apply suggestions from code review * fix * fix meta tags creating empty lines * pyright * node * fix httpserver address * drop tutils.default_trainer_options * imports * Better fix for load_from_checkpoint() not working with absolute path on Windows (#2294) * Fix load_from_checkpoint() not working with URL on Windows * Update CHANGELOG * Update CHANGELOG.md Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com> * drop duplicate Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com> Co-authored-by: airium <airium@outlook.com> Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: AIRIUM <38249940+airium@users.noreply.github.com>	2020-06-26 21:38:25 -04:00
Adrian Wälchli	e085e93dd3	Add missing test for "multiple dataloader + percent_check fix" (#2226 ) * Init fix num_batches * Fix num_batches in case of multiple dataloaders * Apply suggestions from code review * Changes based on suggestions * Flake8 * Add test to check num_batches * generalize dataloader percent check test * fix formatting * remove hparams * tests * CHANGELOG * Update CHANGELOG.md * max_batches can be int * conflict and rebase * add back the test fix fix message 0.0 works Revert "fix message" This reverts commit 839cacf8b8610f4e697e654ef6f3d2501bf23984. * update changelog * Update CHANGELOG.md * Fix num batches in case of multiple dataloaders and percent_check (#1920) * git conflict Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * missing union * doc update suggestion by @rohitgr7 * extend test * changelog * docs add note about multiple loaders * update changelog * remove unused variable Co-authored-by: rohitgr7 <rohitgr1998@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-06-23 11:21:24 -04:00
Adrian Wälchli	bdee1cd106	update docs for "overfit_batches" (#2324 ) * update docs * Apply suggestions from code review Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-06-23 11:19:38 -04:00
William Falcon	04c794ca72	[WIP] Rename overfit_pct to overfit_batches (and fix) and val_percent_check and test_percent_check (and fix) (#2213 ) * fixed percent check for val/test * overfit_pct now uses train loaders for val and test and does not shuffle * add on fit_start on fit_end hooks Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-06-17 08:03:28 -04:00
William Falcon	5fd01b0e68	Finish Ananthsub patch 1 (enable prepare_data from correct processes). clarify local vs global rank (#2166 ) * [trainer] Call prepare_data once per node in DDP/DDP2 training * refactored DDP routes * renamed proc_rank to local_rank * spawn message * fixes * Update trainer.py Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>	2020-06-13 12:00:14 -04:00
Udit Arora	08573d0f7e	Fix some pyright member access errors in training module (#2121 ) * Fix pyright member access errors in training module * Fix Trainer instantiation error due to inheritence order * Add GH workflow for pyright * Fix more pyright errors in trainer module * Add pyrightconfig and setup python environment in type-check workflow * Exclude pyrightconfig.json * suggestions Co-authored-by: Jirka <jirka@pytorchlightning.ai>	2020-06-12 17:23:18 +02:00
Adrian Wälchli	8211256c46	data transfer model hook (+ refactor) (#1756 ) * refactor and added hook variant a variant b add test revert rename add changelog docs * resolve merge duplication * overridden typo * fix test * tpu id * raise if TPU not available * re-use apply_to_collection function for parsing collections * comment * make utility function available to user * documentation * move changelog entry to top * fix tpu transfer call * fix call * remove hardcoded string * improve test * call model hook by default * Apply suggestions from code review * rename utility function Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-06-02 21:45:19 -04:00
Lezwon Castelino	943c4b20af	slow tpu train (#2033 ) * use parallel loader * Revert "use parallel loader" This reverts commit ed6e7583 * select tpu id for pl * condition if tpu_id is None * added info to changelog * Revert "condition if tpu_id is None" This reverts commit `1fb6e586` * Apply suggestions from code review Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-06-02 18:48:05 -04:00
Lezwon Castelino	7c7e50ca47	Allow user to select individual TPU core to train on (#1729 ) * added tpu_id added tpu_id to mixins * train on individual tpu * parallel loader if tpu_id is None * removed progress_bar_refresh_rate * chlog * replaced num_tpu_cores with tpu_cores * set tpu_id to None if int * changed num_tpu_cores to tpu_cores in docs * updated docs * updated __init__.py removed self.tpu_id for ParallelLoader * Update pytorch_lightning/trainer/__init__.py * check if tpu_cores is a list Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * xla device conditional * num_tpu_cores deprecation * removed duplicate warning * fixed pep8 error * Revert "removed duplicate warning" This reverts commit `8adb0a9b` * deprecated api update * fixed recursion error * fixed tests * fixed flake errors * removed current_tpu_index * Update CHANGELOG.md * Update trainer.py Co-authored-by: Jirka <jirka.borovec@seznam.cz> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-05-17 16:30:54 -04:00
Rohit Gupta	56d521a317	Fix test configuration check and testing (#1804 ) * Fix test configuration check and testing * Fix test configuration check and testing * Remove check_testing_configuration during test * Fix docstring * fix function name * remove conflicts	2020-05-17 08:22:44 -04:00
So Uchida	22d7d03118	Replace meta_tags.csv with hparams.yaml (#1271 ) * Add support for hierarchical dict * Support nested Namespace * Add docstring * Migrate hparam flattening to each logger * Modify URLs in CHANGELOG * typo * Simplify the conditional branch about Namespace Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * Update CHANGELOG.md Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * added examples section to docstring * renamed _dict -> input_dict * mata_tags.csv -> hparams.yaml * code style fixes * add pyyaml * remove unused import * create the member NAME_HPARAMS_FILE * improve tests * Update tensorboard.py * pass the local test w/o relavents of Horovod * formatting * update dependencies * fix dependencies * Apply suggestions from code review * add savings * warn * docstrings * tests * Apply suggestions from code review * saving * Apply suggestions from code review * use default * remove logging * typo fixes * update docs * update CHANGELOG * clean imports * add blank lines * Update pytorch_lightning/core/lightning.py Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * Update pytorch_lightning/core/lightning.py Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * back to namespace * add docs * test fix * update dependencies * add space Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2020-05-13 15:05:15 +02:00
William Falcon	eeb411144f	enable fast_dev_run without a validation loop (#1779 ) * fix val dataloader * Update evaluation_loop.py	2020-05-11 11:30:22 -04:00
Shunta Komatsu	f656882942	Fix typo (#1750 )	2020-05-07 09:25:54 -04:00
Adrian Wälchli	3e8f2d99a9	Progress bar callback (#1450 ) * squash and rebase sanity check hooks sanity check callback hook finish moved core progress bar functionality into callback wip remove duplicate merge clean up imports docs sanity check progress bar main sanity move callback calls init progrss bar callback configuration and docs changelog rate decorator pass process_position disable on rank > 0 position index is_enabled remove decorator refactor init tqdm bars callback method ordering cannot reset when disabled sequence -> list default values fix has no attr _time() move on_val_end to proper place fix the pickle issue update warning properties check for None remove old comment switch order pull out non-tqdm functionality into base class documentation for the base class docs fix refresh rate issue in validation restrict type hint of trainer arg more docs update trainer docs rst docs fix lines too long fix test add missing type hints fix typo move docstring to __init__ solves doctest failures remove doctest :(( can't fix the pickle error fix example simplify by saving trainer reference fix docs errors move docstring initial value multiple val checks per epoch simpler handling of inf dataset sizes update inf docs renamed training_tqdm_dict rename get_tqdm_dict rename occurences of tqdm update changelog fix doctest fix formatting errors added callback tests progress bar on off test more tests for progress bar weird test fix? add ignored property disable default progress bar in LR finder change enable/disable behavior trying doctest in CI again undo doctest pickle error undo doctest pickle error :(( remove progress_bar_callback Trainer arg and fix tests restore progress bar after auto lr find update docs fix rebase fix wrong negation * fix fast dev run total * more thorough testing * remove old args * fix merge * fix merge * separate tests * type hint total batches * reduce if Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * is_disabled Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * is_enabled Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * rename enabled/disabled * move deprecated api * remove duplicated test from merge * fix rename is_disabled * newline * test also testprogress for fast dev run Co-authored-by: J. Borovec <jirka.borovec@seznam.cz> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-04-23 20:46:18 -04:00
Jirka Borovec	7989ca844c	test deprecation warnings (#1470 ) * check deprecation warnings * extend warning test * try * unimport modules * update	2020-04-23 17:34:47 -04:00
William Falcon	29ebe92208	support for native amp (#1561 ) * adding native amp suppport * adding native amp suppport * adding native amp suppport * adding native amp suppport * autocast * autocast * autocast * autocast * autocast * autocast * removed comments * removed comments * added state saving * added state saving * try install amp again * added state saving * drop Apex reinstall Co-authored-by: J. Borovec <jirka.borovec@seznam.cz> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-04-23 14:47:08 -04:00
Travis Addair	7024177f7d	Added Horovod distributed backend (#1529 ) * Initial commit of Horovod distributed backend implementation * Update distrib_data_parallel.py * Update distrib_data_parallel.py * Update tests/models/test_horovod.py Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * Update tests/models/test_horovod.py Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * Fixed tests * Added six * tests * Install tox for GitHub CI * Retry tests * Catch all exceptions * Skip cache * Remove tox * Restore pip cache * Remove the cache * Restore pip cache * Remove AMP Co-authored-by: William Falcon <waf2107@columbia.edu> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>	2020-04-22 17:39:08 -04:00
William Falcon	ae2e14e3ed	fixed memory leak from opt return (#1528 ) * fixed memory leak from opt return	2020-04-19 16:41:54 -04:00
William Falcon	3431c62d41	Remove error when test dataloader used in test (#1495 ) * remove error when test dataloader used in test * remove error when test dataloader used in test * remove error when test dataloader used in test * remove error when test dataloader used in test * remove error when test dataloader used in test * remove error when test dataloader used in test * fix lost model reference * remove error when test dataloader used in test * fix lost model reference * moved optimizer types * moved optimizer types * moved optimizer types * moved optimizer types * moved optimizer types * moved optimizer types * moved optimizer types * moved optimizer types * added tests for warning * fix lost model reference * fix lost model reference * added tests for warning * added tests for warning * refactoring * refactoring * fix imports * refactoring * fix imports * refactoring * fix tests * fix mnist * flake8 * review Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>	2020-04-15 22:16:40 -04:00
Jirka Borovec	b3fe17ddeb	fix flushing loggers (#1459 ) * flushing loggers * flushing loggers * flushing loggers * flushing loggers * changelog * typo * fix trains * optimize imports * add logger test all * add logger test pickle * flake8 * fix benchmark * hanging loggers * try * del * all * cleaning	2020-04-14 20:32:33 -04:00
William Falcon	1f685c2882	fix pretty print (#1441 ) * grid sample * grid sample * grid sample * grid sample * grid sample * changelog * version Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>	2020-04-10 08:43:22 -04:00
Jirka Borovec	17f58d2e11	add rank warning (#1428 ) * add rank warning * changelog * use rank_zero_warn * user trainer_init * replace warnings * fix test * flake8 * docs * changelog * bug lol	2020-04-09 14:05:46 -04:00
vguizilini	2ae2bd2b46	Print test results only if prog_bar_metrics is not empty (#1411 ) * Print test results only if prog_bar_metrics is not empty * Update evaluation_loop.py Co-authored-by: vitor-guizilini <vitor.guizilini@tri.global> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-04-08 11:51:52 -04:00
Adrian Wälchli	ebd9fc9530	Fix for incorrect run on the validation set with overwritten validation_epoch_end and test_end (#1353 ) * reorder if clauses * fix wrong method overload in test * fix formatting * update change_log * fix line too long	2020-04-03 09:25:32 -04:00
Gerard Bentley	f33b5a8d99	Simplify progress bar args (#1108 ) * show progress bar dependent on refresh_rate * test progress_bar_refresh control show bar * remove show_progress_bar from other tests * borda fixes * flake8 fix * changelog update prog bar refresh rate * move show_progress_bar to deprecated 0.9 api * rm show_progress_bar references, test deprecated * Update pytorch_lightning/trainer/__init__.py * fix test * changelog * minor CHANGELOG.md format * Update pytorch_lightning/trainer/__init__.py * Update pytorch_lightning/trainer/trainer.py Co-authored-by: Gerard Bentley <gbkh2015@mymail.pomona.edu> Co-authored-by: William Falcon <waf2107@columbia.edu> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>	2020-04-03 00:53:00 +02:00
Teven	04935ea718	fixed extra dataloader bug (#1196 ) * fixed extra dataloader bug * Update pytorch_lightning/trainer/training_loop.py Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * updated CHANGELOG * Small non-repetition change self.get_model() => model as it was already defined * Update CHANGELOG.md * changed argument name to reload_train_dataloader_every_epoch * fixed doc underline too short * reverted to `reload_dataloaders_every_epoch` * fixed val and test reloading * fixed val and test reloading Co-authored-by: TevenLeScao <teven.lescao@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-04-02 11:41:56 +02:00
Jirka Borovec	6ddb03922a	Profiler summary (#1259 ) * refactor and add types * add Prorfiler summary * fix imports * Revert "refactor and add types" This reverts commit b4c552fa * changelog * revert rename * fix test * mute verbose	2020-03-31 08:57:48 -04:00
Jirka Borovec	09167efdb5	Checkpointing interval (#1272 ) * formatting * formatting * fix interval * fix train loop * fix test * parametrize test * Apply suggestions from code review Co-Authored-By: Adrian Wälchli <adrian.waelchli@students.unibe.ch> * fix calling * flake8 * add types Co-authored-by: Adrian Wälchli <adrian.waelchli@students.unibe.ch> Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-03-30 18:37:02 -04:00
Jirka Borovec	c869dd8b8f	make evaluate private (#1260 ) * make evaluate private * changelog	2020-03-30 12:14:27 -04:00
So Uchida	60b8246bc3	Pretty test results with pprint (#1176 )	2020-03-24 14:52:57 -04:00
Jirka Borovec	3be81cb54e	test deprecated - model (#1074 ) * pylint * model API * update test * formatting * disable logger * fix checking overwrite * fix test * typo * deprecated model * fix for DDP * drop Flake8 in GH actions * Update pytorch_lightning/trainer/evaluation_loop.py * fix imports Co-authored-by: Nic Eggert <nic@eggert.io>	2020-03-20 20:51:14 +01:00
Adrian Wälchli	792962ecc9	CI: Force docs warnings to be raised as errors (+ fix all) (#1191 ) * add argument to force warn * fix automodule error * fix permalink error * fix indentation warning * fix warning * fix import warnings * fix duplicate label warning * fix bullet point indentation warning * fix duplicate label warning * fix "import not top level" warning * line too long * fix indentation * fix bullet points indentation warning * fix hooks warnings * fix reference problem with excluded test_tube * fix indentation in print * change imports for trains logger * remove pandas type annotation * Update pytorch_lightning/core/lightning.py * include bullet points inside note * remove old quick start guide (unused) * fix unused warning * fix formatting * fix duplicate label issue * fix duplicate label warning (replaced by class ref) * fix tick * fix indentation warnings * docstring ticks * remove obsolete docstring typing * Revert "remove old quick start guide (unused)" This reverts commit `d51bb40695`. * added old quick start guide to navigation * remove unused tutorials file * ignore some modules that got deprecated and are not used anymore * fix duplicate label warning * move examples doc and exclude pl_examples from autodoc * fix formatting for configure_optimizer * fix no blank line warnings * fix "see also" labels and add paramref extension * fix more reference problems * fix multi-gpu reference * fix weird warning * fix indentation and unrecognized characters in code block * fix warning "... not included in toctree" * fix PIL import error * fix duplicate target "here" warning * fix broken link * revert accidentally moved pl_examples * changelog * stdout * note some things to know Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: J. Borovec <jirka.borovec@seznam.cz> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-03-20 20:49:01 +01:00
Jirka Borovec	22a7264e9a	improve partial Codecov (#1172 ) * ignore in setup * show report * abs imports * abstract pass * cover loggers * doctest trains * locals * pass * revert tensorboard * use tensorboardX * revert tensorboardX * fix trains * Add TrainsLogger.set_credentials (#1179) * Add TrainsLogger.set_credentials to control trains server configuration and authentication from code. Sync trains package version. Fix CI Trains tests * Add global TrainsLogger set_bypass_mode (#1187) * Add global TrainsLogger set_bypass_mode skips all external communication Co-authored-by: bmartinn <> * rm some no-cov Co-authored-by: Martin.B <51887611+bmartinn@users.noreply.github.com>	2020-03-19 09:14:29 -04:00
Ethan Harris	2b3f443f6b	Add support for IterableDatasets everywhere (#1104 ) * Add support for IterableDatasets everywhere * Added type hints, simplified code and improved coverage in data_loading.py * Update CHANGELOG.md	2020-03-12 12:46:02 -04:00
Jirka Borovec	514d182b7f	cleaning imports (#1032 )	2020-03-12 12:41:37 -04:00
William Falcon	29faea1862	Steps (#1051 ) * training_end renamed to training_step_end * training_end to training_step_end * fix lost model reference * training_end to training_step_end	2020-03-05 12:32:45 -05:00
William Falcon	bcb45d906d	proper checkpoint implementation (#1043 ) * enabled early stopping/checkpooiunt even without val step * name formatting * version * testing * add test * fix test * Update model_checkpoint.py * doctests * pylint * tests * debug * enabled early stopping/checkpooiunt even without val step * fix MNIST download (#1044) * fix MNIST download * simple * name formatting * version * testing * add test * fix test * doctests * tests * debug * rebased 1041 * tests * rebased 1041 Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-03-04 23:02:19 -05:00
Jeremy Jordan	705e576417	consolidate callbacks and hooks (#950 ) * consolidate callbacks and hooks * ensure callbacks recieve proper arg types * remove model from init callback events * clean up early stopping event * update changelog * remove on_fit_start and on_fit_end * fix args for on_init_start and on_init_end * handle case where early stopping is not used * show all callback methods * wrap checkpoint callback logic into proper class * fix check for main process in checkpoint callback * move callbacks test to separate file * refactor arg checks * get model and call hook on same line * define trainer_options dict in one call * add more asserts to callback test	2020-03-02 23:51:32 -05:00
William Falcon	6dae5698ef	fixes test issues on ddp (#1017 ) * updated checkpoint docs	2020-03-02 21:50:38 -05:00
William Falcon	2a04be0386	No auto load weights (#985 ) * remove autoload * added weights loading docs * checkpoint loading saving docs * docs (#1010) * remove autoload * added weights loading docs * checkpoint loading saving docs * docs	2020-03-02 17:12:22 -05:00
Jirka Borovec	479a35d94e	fix docs (#982 )	2020-02-28 18:48:07 -05:00
Jirka Borovec	7beed7cae6	Trainer cleanup (#934 ) * Trainer cleanup * update abstract * remove ... * remove __init__ * update mixin types * update callbacks * fix * lower test acc	2020-02-27 16:21:14 -05:00
William Falcon	f86dd55145	fixes tpu data loader bug (#957 ) * fixes tpu data loader bug * fixes tpu data loader bug	2020-02-26 19:29:03 -05:00
Hadrien Mary	be244560b2	Callbacks [wip] (#889 ) * Add callback system + associated test * Add trainer and pl_module args to callback methods * typing * typo in docstring * Switch to on_._start() fix on_test_start * fix the mess after rebasing	2020-02-25 23:17:27 -05:00
Jirka Borovec	5dd2afeab1	Fixing tests (#936 ) * abs import * rename test model * update trainer * revert test_step check * move tags * fix test_step * clean tests * fix template * update dataset path * fix parent order	2020-02-25 13:06:24 -05:00
William Falcon	ceec51d96c	fix tests (#938 ) * fix tests * fix tests	2020-02-25 08:53:33 -05:00
Matt Painter	6b667b1237	Fix/test pass overrides (#918 ) * Fix test requiring both test_step and test_end * Add test Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-02-24 22:33:11 -05:00
William Falcon	1015a00506	Clean up dataloader logic (#926 ) * added get dataloaders directly using a getter * deleted decorator * added prepare_data hook * refactored dataloader init * added dataloader reset flag and main loop * made changes * fixed bad loaders * fixed error in .fit with loaders * fixes #909 * bug fix * Fixes #902	2020-02-24 22:23:25 -05:00
William Falcon	d4a31f02e0	Enable TPU support (#868 ) * added tpu docs * added tpu flags * add tpu docs + init training call * amp * optimizer step * added auto data transfer to TPU * fix test pkg create (#873) * added auto data transfer to TPU * added test return and print * Update pytorch_lightning/trainer/trainer.py Co-Authored-By: Luis Capelo <luiscape@gmail.com> * Fix segmentation example (#876) * removed torchvision model and added custom model * minor fix * Fixed relative imports issue * Fix/typo (#880) * Update greetings.yml * Changelog (#869) * Create CHANGELOG.md * Update CHANGELOG.md * Update PULL_REQUEST_TEMPLATE.md * Add PR links to Version 0.6.0 in CHANGELOG.md * Add PR links for Unreleased in CHANGELOG.md * Update PULL_REQUEST_TEMPLATE.md * Fixing Function Signatures (#871) * added tpu docs * added tpu flags * add tpu docs + init training call * amp * optimizer step * added auto data transfer to TPU * added test return and print Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Luis Capelo <luiscape@gmail.com> Co-authored-by: Akshay Kulkarni <akshayk.vnit@gmail.com> Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk> Co-authored-by: Shikhar Chauhan <xssChauhan@users.noreply.github.com>	2020-02-17 16:01:20 -05:00
Vadim Bereznyuk	edd4a87fb0	Refactor callbacks (#776 ) * Refactor callbacks * flake8 * Update docstrings * Simplified callback, protected trainer * .set_trainer() check * update docs * missed super().__ini__() * Updated tests * Use uppercase * refine checkpoint callback tests * Added test_begin() and test_end()	2020-02-16 00:03:05 -05:00
Jeremy Jordan	1cf430f7bc	new feature for profiling training runs (#782 ) * initial implementation * formatting, pass through profiler, docstring * call profiler during training * add initial tests * report stats when training is done * fix formatting * error handling, bugfix in passthroughprofiler * finish documenting profiler arg in Trainer * relax required precision for profiling tests * option to dump cProfiler results to text file * use logging, format with black * include profiler in docs * improved logging and better docs * appease the linter * better summaries, wrapper for iterables * fix typo * allow profiler=True creation * more documentation * add tests for advanced profiler * Update trainer.py * make profilers accessible in pl.utilities * reorg profiler files * change import for profiler tests Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-02-06 22:01:21 -05:00
Vadim Bereznyuk	5035ce5474	Make default tqdm dict overridable (#749 ) * overridable tqdm_dict * Slim down default tqdm_metrics * gpu fix	2020-02-05 06:24:43 -05:00
Mike Clark	deffbaba7f	for #330 , use tqdm.auto in trainer (#752 ) * use tqdm.auto in trainer This will import the ipywidgets version of tqdm if available. This works nicely in notebooks by not filling up the log. In the terminal it will use the same old tqdm. We might also want to consider passing in the tqdm we want as an argument since there may be some edge cases where ipywidgets is available but the interface doesn't support it (e.g. vscode?) or isn't working. In which case people will get a warning message, but may want to configure it themselves. * use `from tqdm.auto` in eval loop * indents	2020-01-26 10:19:09 -05:00
Jirka Borovec	ea59a99426	update org paths & convert logos (#685 ) * fix typos * update org paths * update links from READMe to docs * add svg logo * add svg logo-text * update logos * testing temp paths * prune links from readme * optimize imports * update logo * update paths in README * missing imports	2020-01-20 14:50:31 -05:00
Jirka Borovec	f72e354ee6	fixing TensorBoard (#687 ) * flake8 * fix typo * fix tensorboardlogger drop test_tube dependence * formatting * fix tensorboard & tests * upgrade Tensorboard * test formatting separately * try to fix JIT issue * add tests for 1.4	2020-01-16 07:22:29 -05:00
Vadim Bereznyuk	756c70a4a0	Clearer disable validation logic (#650 ) * Clearer disable validation logic * fix for fast_dev_run * flake8 fix * Test check fix * update error message	2020-01-13 22:31:15 -05:00
Elliot Waite	b492e2b89e	Change nb to num in ABCs, comments, and tqdm logging (#613 ) * Change nb to num in ABCs, comments, and tqdm logging * Fix warnings text * Make warnings one line * Change num to number in comments	2019-12-09 04:40:26 -08:00
Jirka Borovec	1d4b6be17b	rename trainer modules, drop `_mixin` (#571 ) * rename trainer modules, drop _mixin * fix imports	2019-12-04 11:39:14 -05:00

122 Commits