lightning

Commit Graph

Author	SHA1	Message	Date
Jirka Borovec	79d42d83e7	formatting 3/n: PL modules (#5716 ) * cb * log * prof * tune * flake8	2021-02-08 14:28:38 -05:00
Rohit Gupta	cb67e1d0b2	Separate epoch validation from step validation (#5208 ) * Seperate epoch validaton from step validation * update system * test * baked logic in callbacks * unbake logic in callbacks * fix the call for scheduler * use property * pep * correct rebase * gitignore * ref * add tests * fix * add early stopping test * trigger * chlog * rev * 1.3 * log * Apply suggestions from code review Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> * Update pytorch_lightning/trainer/training_loop.py * Update CHANGELOG.md * Apply suggestions from code review Co-authored-by: chaton <thomas@grid.ai> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> (cherry picked from commit `e429f97b67`)	2021-02-08 20:22:39 +01:00
Alan Du	f6dc354349	Throw MisconfigurationError on unknown mode (#5255 ) * Throw MisconfigurationError on unknown mode * Add tests * Add match condition for deprecation message	2021-01-12 02:31:26 -05:00
chaton	5f94900361	[Feat] Cleanup ModelCheckpoint / EarlyStopping by moving logic to LoggerConnector (#5218 ) * [bug-fix] Metric reduction with Logging (#5150) * add test * resolve bug * udpate test * wrongly copy / paste * update test * resolve a second bug Co-authored-by: Ubuntu <ubuntu@ip-172-31-62-109.ec2.internal> * iupdate * resolve bugs * add test back * correct flake8 * resolve flake8 * update on comments * update tests * add a test * add test * update to Callable Co-authored-by: Ubuntu <ubuntu@ip-172-31-62-109.ec2.internal>	2021-01-07 10:57:26 -05:00
Rohit Gupta	9cfbf8d609	Disable checkpointing, earlystopping and logging with fast_dev_run (#5277 ) * Disable checkpointing, earlystopping and logger with fast_dev_run * docs * chlog * disable callbacks and enable DummyLogger * add log * use dummy logger method * Apply suggestions from code review Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: chaton <thomas@grid.ai> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> (cherry picked from commit `f740245521`)	2021-01-06 12:57:24 +01:00
chaton	6b19198aae	[bug-fix] Metric reduction with Logging (#5150 ) * add test * resolve bug * udpate test * wrongly copy / paste * update test * resolve a second bug Co-authored-by: Ubuntu <ubuntu@ip-172-31-62-109.ec2.internal>	2021-01-05 09:58:37 +01:00
Jirka Borovec	0f36525e8f	fix/enable - check F401 (#5201 ) * refactor - check F401 * missed * fix	2020-12-21 10:15:04 +01:00
Jirka Borovec	059eaecbb4	set xxx_AVAILABLE as protected (#5082 ) * sett xxx_AVAILABLE as protected * docs	2020-12-14 20:19:05 +05:30
Rohit Gupta	342a2b6f25	Deprecate auto mode from ModelCheckpoint and EarlyStopping (#4695 ) * remove auto mode from callbacks * chlog * remove auto mode from callbacks * mode * mode * move back * update docs * update docstrings * docstring warning * fix syntax * Apply suggestions from code review Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * isort * default to 'auto' * syntax Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-12-04 16:11:58 +01:00
Jirka Borovec	442d57f1e9	simplify imports xla / TPU (#4872 ) * xla * tpu * fix * fix * flake8	2020-11-27 00:37:48 +01:00
Dusan Drevicky	2ffad4c89f	Fix info message when EarlyStopping 'mode' not provided [ci skip] (#4282 ) * Fix info message when EarlyStopping 'mode' not provided * fixup! Fix info message when EarlyStopping 'mode' not provided * Apply suggestions from code review Co-authored-by: Jeff Yang <ydcjeff@outlook.com> Co-authored-by: Jeff Yang <ydcjeff@outlook.com>	2020-10-21 23:44:13 +05:30
Jirka Borovec	f37444fa3e	CI: add flake8 (#4239 )	2020-10-19 21:20:17 +01:00
Jirka Borovec	36f0c8a61d	remove deprecated callbacks (#3979 )	2020-10-08 06:37:14 -04:00
William Falcon	048a816be3	added tests for the training epoch end (#3967 )	2020-10-07 22:27:36 -04:00
Jeff Yang	fe5b943965	Callback docs with autosummary (#3908 ) * callback docs with autosummary * do not show private methods * callback base docstring	2020-10-06 17:28:45 -04:00
Jirka Borovec	064ae53d63	nb steps in early stop (#3909 ) * nb steps * if * skip * rev * seed * seed	2020-10-06 15:20:08 -04:00
Lezwon Castelino	69833dad5b	Added check to verify xla device is TPU (#3274 ) * tpu device check * replaced with xmp spawn * Revert "replaced with xmp spawn" This reverts commit 6835380f * replaced all instances of XLA_AVAILABLE * moved inner_f to global scope * made refactors * added changelog * added TPU_AVAILABLE variable * fix codefactor issues * removed form trainer and early stopping * add TORCHXLA_AVAILABLE check * added tests * refactoring * Update pytorch_lightning/utilities/xla_device_utils.py Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * updated function names * fixed bug * updated CHANGELOG.md * added todo * added type hints * isort and black Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-10-06 19:54:37 +02:00
Adrian Wälchli	cc9781a0ad	Deprecate early_stop_callback Trainer argument (part 2) (#3845 ) * update tests with EarlyStopping default * imports * revert legacy tests * fix test * revert * revert	2020-10-04 17:36:47 -04:00
Adrian Wälchli	1906867fd4	deprecation warning (#3844 )	2020-10-04 13:17:09 -04:00
William Falcon	d9bc95f83e	ref: bug fix with logging val epoch end + monitor (#3812 ) * ref: fix metric err * ref: merge * ref: decoupled ddp2 * ref: clean up ddp before final fix	2020-10-03 12:33:29 -04:00
William Falcon	21cfdf6874	ref: result 1/n (make monitor default to checkpoint_on to simplify re… (#3571 ) * ref: result 1/n (make monitor default to checkpoint_on to simplify result syntax) * Update pytorch_lightning/callbacks/model_checkpoint.py Co-authored-by: ananthsub <ananth.subramaniam@gmail.com> * ref: result 1/n (make monitor default to checkpoint_on to simplify result syntax) * force crash when max_epochs < epochs in a checkpoint Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>	2020-09-20 22:58:43 -04:00
Carlos Mocholí	580b04b490	Fix ModelCheckpoints name formatting (#3163 ) * Fix ModelCheckpoint's name formatting * Fix failing tests * Add dot to CHECKPOINT_SUFFIX * Set variables to their default values at the end of tests * Fix logic for filepath='' and filename=None. Add test * Fix Windows tests * Fix typo. Remove leading line break and zeroes * Remove CHECKPOINT_SUFFIX * Fix typos. Use appropriate f-string format * Apply suggestions from code review * Fix broken tests after #3320 * Finish changes suggested by Borda * Use explicit test var names * Apply suggestions Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * Apply suggestions Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * Update CHANGELOG * Apply suggestions from code review * for * prepend whitespace in warn msg Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2020-09-18 23:09:11 +02:00
Lucas Steinmann	197acd535f	Fix early stopping with training step's return dict (#3347 ) * Fixes the test for early stopping without val step. The expression which checked, if early stopping was triggered, had an off-by-one error and hence was true even if early stopping was not triggered. Furthermore set patience to 0 and max epochs to 10, to ensure loss has enough time to flatten. * Fixes early stopping without val step. The issue has been, that only `early_stop_on` key was checked and not an arbitrary monitor key. * Fixes branch, which checks whether early stopping is done during validation. Before only `val_early_stop_on` was checked. Since arbitrary keys can be used, the set of possible validation keys cannot be exhaustive. Hence this disables "early stopping on_train_epoch_end" via an instance attribute if early stopping was executed in on_validation_epoch_end. Furthermore adds a test, which ensures arbitrary keys work. * Improve check whether eval results are used. Only disable early checking with train results if eval results are actually used. Before they were always disabled in ``on_validation_epoch_end``. Rename and document instance variable, to make it more clear. * Remove wrong documentation on behaviour of early stopping with train result' dict. * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-09-18 23:08:04 +02:00
William Falcon	ef20310873	ref: move specific accelerator code x/n (#3457 ) * ref: organize args x/n * ref: move specific accelerator code x/n	2020-09-11 10:56:21 -04:00
William Falcon	0b5b70d6c9	ref: inner train loop (intermediate step) 17/n (#3376 ) * ref: inner train loop (intermediate step) 17/n	2020-09-07 09:31:42 -04:00
Lezwon Castelino	3910ad0330	bugfix/3185 transpose (#3252 ) * change t() to transpose() as xla devices do not support .t() on 1-dim tensor * detach tensor before copying * Revert "detach tensor before copying" This reverts commit `37cc7bbe` * changed dims * added test_result_obj_on_tpu * detach before copying * replace torch.cat with sum	2020-09-01 09:17:52 -04:00
Jeremy Jordan	a5d1176cf6	callback method for on_save_checkpoint (#2501 ) * initial draft * fix test * Update pytorch_lightning/trainer/callback_hook.py Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * fix tests * remove old code * untested upgrade script * document limitations * clean up and add tests * Update pytorch_lightning/trainer/training_io.py Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * reflect PR comments * fix formatting * Update docs/source/callbacks.rst * clarify docs * revert change for loading checkpoints * small edits Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2020-08-28 16:50:52 +02:00
William Falcon	f3c63f7746	tests to ensure correct dataloader calls (#3221 ) * tests to ensure correct dataloading interval and sequence	2020-08-27 09:49:46 -04:00
Duc Pham	4d98419bb8	Fix potential typo in early stopping `monitor` keys (#3213 ) * Fix typo * ref: group prepare data hook (6) (#3212) * group prepare data hook * Fix typo Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-08-26 22:21:30 -04:00
William Falcon	a1705441a9	ref: remove _evaluate fx (#3197 ) * remove _evaluate	2020-08-26 12:28:14 -04:00
William Falcon	11b86e44b5	remove last of bad result obj warning (#3073 ) * fixed bad warn for result obj * fixed bad warn for result obj	2020-08-20 09:06:29 -04:00
William Falcon	f43028f3ae	added copyright notices (#3062 )	2020-08-19 22:03:22 -04:00
William Falcon	51de6802ed	added warning when changing monitor and using results obj (#3014 ) * added warning when changing monitor and using results obj	2020-08-17 10:29:28 -04:00
William Falcon	6c5a0a172f	Resultd (#2947 ) * updated docs	2020-08-13 09:58:05 -04:00
William Falcon	d13e5c9e53	document lightiningmodule better (#2920 ) * updated docs	2020-08-11 19:39:43 -04:00
Jirka Borovec	b7d72706c3	clean imports (#2867 ) * clean imports * miss	2020-08-08 00:33:51 +02:00
siahuat0727	b9381c3258	Fix docs typo (#2747 )	2020-07-29 07:11:49 -04:00
William Falcon	62ce00f96c	EvalResult support for val loop (PR 3/5) (#2651 ) * add EvalResult to support to val/test loops	2020-07-22 13:53:10 -04:00
William Falcon	6d10ac2ac8	Structured results (train loop only. val loop separate PR) (PR 2/5) (#2615 ) * r * patched optimizer closure with sr * added train step structured result * added autoreduce for train step * added auto reduce on train * added hooks * finished tests for structured results on train epoch * cache * finished tests for structured results on train epoch * Update pytorch_lightning/callbacks/early_stopping.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update pytorch_lightning/callbacks/model_checkpoint.py * Update pytorch_lightning/core/step_result.py * finished tests for structured results on train epoch * Apply suggestions from code review Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com> * simple * finished tests for structured results on train epoch * simple * revert * finished tests for structured results on train epoch * Update tests/base/deterministic_model.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * finished tests for structured results on train epoch * docstring typos * finished tests for structured results on train epoch * Update pytorch_lightning/core/step_result.py Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * Update pytorch_lightning/overrides/data_parallel.py Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Jirka <jirka@pytorchlightning.ai> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>	2020-07-20 19:00:20 -04:00
William Falcon	e5a979990e	Hang (#2488 ) * added early stop tpu test	2020-07-03 15:16:45 -04:00
William Falcon	020c332ae9	Clean up (#2467 ) * Fixes #2455 * added early stop tpu test	2020-07-03 00:38:29 -04:00
Adrian Wälchli	25ee51bc57	Continue Jeremy's early stopping PR #1504 (#2391 ) * add state_dict for early stopping * move best attr after monitor_op defined * improve early stopping and model checkpoint callbacks * fix formatting * fix attr init order * clean up setting of default_root_dir attr * logger needs default root dir set first * reorg trainer init * remove direct references to checkpoint callback * more fixes * more bugfixes * run callbacks at epoch end * update tests to use on epoch end * PR cleanup * address failing tests * refactor for homogeneity * fix merge conflict * separate tests * tests for early stopping bug regressions * small fixes * revert model checkpoint change * typo fix * fix tests * update train loop * cannot pass an int as default_save_path * refactor log message * fix test case * appease the linter * fix some doctests * move config to callback * fixes from rebase * chlog * docs * reformat * formatting * fix * fixes from rebase * add new test for patience * Update pytorch_lightning/callbacks/model_checkpoint.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update tests/callbacks/test_early_stopping.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * fix formatting * remove enable_early_stop attribute * add state_dict for early stopping * move best attr after monitor_op defined * improve early stopping and model checkpoint callbacks * fix formatting * fix attr init order * clean up setting of default_root_dir attr * logger needs default root dir set first * reorg trainer init * remove direct references to checkpoint callback * more fixes * more bugfixes * run callbacks at epoch end * update tests to use on epoch end * PR cleanup * address failing tests * refactor for homogeneity * fix merge conflict * separate tests * tests for early stopping bug regressions * small fixes * revert model checkpoint change * typo fix * fix tests * update train loop * fix test case * appease the linter * fix some doctests * move config to callback * fixes from rebase * chlog * docs * reformat * formatting * fix * fixes from rebase * add new test for patience * Update pytorch_lightning/callbacks/model_checkpoint.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update tests/callbacks/test_early_stopping.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * fix formatting * remove enable_early_stop attribute * fix test with new epoch indexing * fix progress bar totals * fix off by one error (see #2289) epoch starts at 0 now * added missing imports * fix hpc_save folderpath * fix formatting * fix tests * small fixes from a rebase * fix * tmpdir * wandb * fix merge conflict * add back evaluation after training * test_resume_early_stopping_from_checkpoint TODO * undo the horovod check * update changelog * remove a duplicate test from merge error * try fix dp_resume test * add the logger fix from master * try remove default_root_dir * try mocking numpy * try import numpy in docs test * fix wandb test * pep 8 fix * skip if no amp * dont mock when doctesting * install extra * fix the resume ES test * undo conf.py changes * revert remove comet pickle from test * Update CHANGELOG.md Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update weights_loading.rst * renamed flag * revert the None check in logger experiment name/version * add the old comments * _experiment * test chckpointing on DDP * skip the ddp test on windows * cloudpickle * renamed flag * parentheses for clarity * apply suggestion max epochs Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Jeremy Jordan <jtjordan@ncsu.edu> Co-authored-by: Jirka <jirka@pytorchlightning.ai> Co-authored-by: Jeremy Jordan <13970565+jeremyjordan@users.noreply.github.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-06-28 21:36:46 -04:00
William Falcon	479ab49d03	temporarily fixes early stopping bug (#2119 ) * fixes early stopping bug * fixe docs * added test	2020-06-08 19:28:26 -04:00
William Falcon	0be530a427	Revert "Fixes EarlyStopping With Precision=16 (#1996 )" (#2032 ) This reverts commit `bf39cb26c5`.	2020-05-31 15:20:18 -04:00
authman	bf39cb26c5	Fixes EarlyStopping With Precision=16 (#1996 ) * Patch for issue 1815, which will allow EarlyStopping to work on precision=16 * Added a whitespace to the end of the line so CICD can rerun. No reason for the latest macos test to have been cancelled. * Format.	2020-05-31 15:02:19 -04:00
Federico Baldassarre	65b4352930	early stopping checks on_validation_end (#1458 ) * Fixes PyTorchLightning/pytorch-lightning#490 `EarlyStopping` should check the metric of interest `on_validation_end` rather than `on_epoch_end`. In a normal scenario, this does not cause a problem, but in combination with `check_val_every_n_epoch>1` in the `Trainer` it results in a warning or in a `RuntimeError` depending on `strict`. * Highlighted that ES callback runs on val epochs in docstring * Updated EarlyStopping in rst doc * Update early_stopping.py * Update early_stopping.rst * Apply suggestions from code review Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * Update docs/source/early_stopping.rst * fix doctest indentation warning * Train loop calls early_stop.on_validation_end * chlog Co-authored-by: William Falcon <waf2107@columbia.edu> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Jirka <jirka@pytorchlightning.ai>	2020-05-25 17:33:00 +00:00
Jeremy Jordan	fc7f5919b5	improve pickle tests for callbacks (#1717 ) * improve pickle tests for callbacks * set mode dict as a class attr	2020-05-05 14:08:54 -04:00
William Falcon	a24c88ab08	ddp pickle	2020-04-27 08:19:19 -04:00
William Falcon	9020cf91b5	fixed warning	2020-04-26 12:53:42 -04:00
William Falcon	ae2e14e3ed	fixed memory leak from opt return (#1528 ) * fixed memory leak from opt return	2020-04-19 16:41:54 -04:00


58 Commits