lightning

Commit Graph

Author	SHA1	Message	Date
William Falcon	b34c7add23	Fixes #3668 , #3887 as a bonus (#3888 ) * Fixes #3668, #3887 as a bonus * Fixes #3668, #3887 as a bonus	2020-10-05 21:30:41 -04:00
William Falcon	b014223f72	Fixes #2678 - enables training_step to return None (#3862 ) * Fixes #2678 - enables training_step to return None * Fixes #2678 - enables training_step to return None	2020-10-05 07:33:46 -04:00
William Falcon	d787208e76	Fixes #2792 (#3857 )	2020-10-04 23:25:02 -04:00
William Falcon	f58c760409	Fixes #2551 (#3858 )	2020-10-04 23:02:35 -04:00
William Falcon	97e62b38cf	Fixed #2143 and many more :) (#3855 )	2020-10-04 22:18:49 -04:00
William Falcon	d9656d166c	fixed model checkpoint frequency (#3852 ) * fixed model checkpoint frequency * fixed model checkpoint frequency * fixed model checkpoint frequency * fixed model checkpoint frequency * merged	2020-10-04 21:49:20 -04:00
William Falcon	2bca89a752	added tbptt test for logging (#3850 ) * added tbptt test for logging * added tbptt test for logging	2020-10-04 19:38:42 -04:00
William Falcon	00f0d19a61	fixes #3798 (#3849 ) * fix #3798 * added tbptt test for logging	2020-10-04 19:36:51 -04:00
Carlos Mocholí	89cc12311f	Fix tbptt_reduce_fx when non-floating tensors are logged (#3796 ) * Add failing test * force all tbptt vals to be floats for reduce Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-10-04 17:10:25 -04:00
Rohit Gupta	d3696052cf	Add back sanity checks (#3846 ) * Add back sanity checks * pep	2020-10-04 17:05:26 -04:00
William Falcon	1aa9d39506	Eval epoch can now log independently (#3843 ) * ref: routed epoch outputs to logger * ref: routed epoch outputs to logger * ref: routed epoch outputs to logger * ref: routed epoch outputs to logger	2020-10-04 13:36:35 -04:00
Rohit Gupta	a628d181ee	Fix val_progress_bar total with num_sanity_val_steps (#3751 ) * Fix val_progress_bar total with num_sanity_val_steps * chlog * Fix val_progress_bar total with num_sanity_val_steps * move test * replaced with sanity flag and suggestions	2020-10-04 08:32:18 -04:00
William Falcon	66aef10239	verified epoch logging (#3830 ) * ref: fix epoch logging * verified epoch logging * verified epoch logging * verified epoch logging * verified epoch logging * verified epoch logging * verified epoch logging * verified epoch logging * verified epoch logging	2020-10-03 21:17:24 -04:00
William Falcon	3903cf63c6	ref: training flag tests (val_check_interval) (#3825 ) * added test_val_check_interval tests * added test_val_check_interval tests * added test_val_check_interval tests	2020-10-03 14:05:01 -04:00
William Falcon	d9bc95f83e	ref: bug fix with logging val epoch end + monitor (#3812 ) * ref: fix metric err * ref: fix metric err * ref: fix metric err * ref: merge * ref: merge * ref: merge * ref: merge * ref: decoupled ddp2 * ref: decoupled ddp2 * ref: decoupled ddp2 * ref: decoupled ddp2 * ref: decoupled ddp2 * ref: clean up ddp before final fix * ref: clean up ddp before final fix * ref: clean up ddp before final fix * ref: clean up ddp before final fix * ref: clean up ddp before final fix * ref: clean up ddp before final fix * ref: clean up ddp before final fix * ref: clean up ddp before final fix * ref: clean up ddp before final fix * ref: clean up ddp before final fix * ref: clean up ddp before final fix	2020-10-03 12:33:29 -04:00
GimmickNG	e4e60e9b82	Add datamodule parameter to lr_find() (#3425 ) * Add datamodule parameter to lr_find() * Fixed missing import * Move datamodule parameter to end * Add datamodule parameter test with auto_lr_find * Change test for datamodule parameter * Apply suggestions from code review Co-authored-by: Nicki Skafte <skaftenicki@gmail.com> * Fix lr_find documentation Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> * formatting * Add description to datamodule param in lr_find * pep8: remove trailing whitespace on line 105 * added changelog Co-authored-by: Nicki Skafte <skaftenicki@gmail.com> Co-authored-by: Nicki Skafte <nugginea@gmail.com> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-10-01 10:33:12 +02:00
Teddy Koker	5ec00ccd28	Added gradient clip test for native AMP (#3754 ) * added gradient clip test for fp16 * pep8	2020-10-01 01:36:34 -04:00
Adrian Wälchli	c73032e39d	Make ModelCheckpoint(save_top_k=-1) track the best models (#3735 ) * fix topk=-1 tracking best * update test * clean up * add changelog * enable loading best topk in trainer.test() * make trivial * return right away * make windows test path happy	2020-09-30 08:34:02 -04:00
Adrian Wälchli	9405c880af	log/save_interval based on global step (#3667 ) * log interval based on global step * test * test * test * test * pep * pep * added changelog * pep * merge * remove unused arg	2020-09-30 12:26:27 +02:00
William Falcon	b3be8022bd	tests for val step flow and logging (#3731 ) * ref: test val epoch end * ref: test val epoch end * ref: test val epoch end * ref: test log dict * ref: test log dict * ref: test log dict * ref: test log dict	2020-09-29 22:12:56 -04:00
William Falcon	c14928a72a	ref: test val flow steps (#3723 ) * ref: test val epoch end * ref: test val epoch end * ref: test val epoch end	2020-09-29 11:42:38 -04:00
William Falcon	f42ea303c9	ref: enable self.log for eval loop metrics (#3715 ) * ref: test val epoch end * ref: test val epoch end * ref: test val epoch end * ref: test val epoch end * ref: test val epoch end * ref: test val epoch end	2020-09-29 02:00:28 -04:00
Rohit Gupta	783750547d	disable optimizers setup during testing (#3059 ) * disable configure_optimizers during testing * minor changes * hvd and ddp * fix precision during testing * fix ddp * fix amp * fix cpu * update dp * simplify optimizers * add test * codefactor * ref optimizer setup * chlog * suggestions * isort * rebased with master	2020-09-29 01:09:04 +02:00
William Falcon	4d5c0fa1bc	ref: separate flow vs log tests (#3704 )	2020-09-28 12:01:52 -04:00
William Falcon	cdd7266cd8	ref: enable self.log from val step (#3701 ) * .log in eval * ref * ref: enable self.log in val step	2020-09-28 10:49:07 -04:00
William Falcon	2ecaa2a8be	ref: (2/n) fix no log in epoch end (#3699 )	2020-09-28 08:25:44 -04:00
William Falcon	ddd11075bd	[WIP] ref: deprecated results obj, added support for simpler comms (1/n) (#3681 ) * ref: deprecated results obj, added support for simpler comms. Decouples logging from loops * ref: deprecated results obj, added support for simpler comms. Decouples logging from loops * ref: deprecated results obj, added support for simpler comms. Decouples logging from loops * ref: deprecated results obj, added support for simpler comms. Decouples logging from loops * ref: deprecated results obj, added support for simpler comms. Decouples logging from loops * ref: deprecated results obj, added support for simpler comms. Decouples logging from loops * fix global step err * fix global step err * fix global step err * fix global step err * fix global step err * fix typing err * fix str * fix typing err	2020-09-27 23:19:46 -04:00
William Falcon	ff2bab0996	ref: (results 1/n) enable tracking original metric when step and epoch are both true (#3685 ) * enable tracking original metric when step and epoch are both true	2020-09-27 22:08:31 -04:00
Adrian Wälchli	f37e9e8a83	Fix global step increment on training_epoch_end (#3673 ) * fix * fix global step err * fix global step err * fix global step err * fix global step err * fix global step err * fix global step err Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-09-27 20:19:51 -04:00
William Falcon	d79bce1dff	enable None model checkpoint default (#3669 ) * enable None model checkpoint default * enable None model checkpoint default * enable None model checkpoint default * enable None model checkpoint default * enable None model checkpoint default * enable None model checkpoint default * enable None model checkpoint default * enable None model checkpoint default * enable None model checkpoint default * enable None model checkpoint default * enable None model checkpoint default * enable None model checkpoint default * enable None model checkpoint default * enable None model checkpoint default * enable None model checkpoint default * enable None model checkpoint default * enable None model checkpoint default * enable None model checkpoint default * enable None model checkpoint default * enable None model checkpoint default * enable None model checkpoint default * enable None model checkpoint default	2020-09-26 23:14:04 -04:00
William Falcon	c591013708	enable any logged metric to be accessible in callbacks (#3598 ) * enable any logged or written metric to be accessible in callbacks * enable any logged or written metric to be accessible in callbacks * enable any logged or written metric to be accessible in callbacks * enable any logged or written metric to be accessible in callbacks * enable any logged or written metric to be accessible in callbacks * enable any logged or written metric to be accessible in callbacks * enable any logged or written metric to be accessible in callbacks * enable any logged or written metric to be accessible in callbacks * enable any logged or written metric to be accessible in callbacks * enable any logged or written metric to be accessible in callbacks * enable any logged or written metric to be accessible in callbacks * enable any logged or written metric to be accessible in callbacks * enable any logged or written metric to be accessible in callbacks * enable any logged or written metric to be accessible in callbacks * enable any logged or written metric to be accessible in callbacks * enable any logged or written metric to be accessible in callbacks * enable any logged or written metric to be accessible in callbacks * enable any logged or written metric to be accessible in callbacks * enable any logged or written metric to be accessible in callbacks * enable any logged or written metric to be accessible in callbacks * enable any logged or written metric to be accessible in callbacks * clarify forward * clarify forward * clarify forward * clarify forward	2020-09-22 18:00:23 -04:00
Nicki Skafte	88e6b29bba	faster tests (#3604 )	2020-09-22 07:37:34 -04:00
William Falcon	21cfdf6874	ref: result 1/n (make monitor default to checkpoint_on to simplify re… (#3571 ) * ref: result 1/n (make monitor default to checkpoint_on to simplify result syntax) * ref: result 1/n (make monitor default to checkpoint_on to simplify result syntax) * ref: result 1/n (make monitor default to checkpoint_on to simplify result syntax) * ref: result 1/n (make monitor default to checkpoint_on to simplify result syntax) * ref: result 1/n (make monitor default to checkpoint_on to simplify result syntax) * ref: result 1/n (make monitor default to checkpoint_on to simplify result syntax) * ref: result 1/n (make monitor default to checkpoint_on to simplify result syntax) * ref: result 1/n (make monitor default to checkpoint_on to simplify result syntax) * ref: result 1/n (make monitor default to checkpoint_on to simplify result syntax) * ref: result 1/n (make monitor default to checkpoint_on to simplify result syntax) * ref: result 1/n (make monitor default to checkpoint_on to simplify result syntax) * ref: result 1/n (make monitor default to checkpoint_on to simplify result syntax) * ref: result 1/n (make monitor default to checkpoint_on to simplify result syntax) * ref: result 1/n (make monitor default to checkpoint_on to simplify result syntax) * ref: result 1/n (make monitor default to checkpoint_on to simplify result syntax) * ref: result 1/n (make monitor default to checkpoint_on to simplify result syntax) * ref: result 1/n (make monitor default to checkpoint_on to simplify result syntax) * ref: result 1/n (make monitor default to checkpoint_on to simplify result syntax) * ref: result 1/n (make monitor default to checkpoint_on to simplify result syntax) * Update pytorch_lightning/callbacks/model_checkpoint.py Co-authored-by: ananthsub <ananth.subramaniam@gmail.com> * ref: result 1/n (make monitor default to checkpoint_on to simplify result syntax) * force crash when max_epochs < epochs in a checkpoint Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>	2020-09-20 22:58:43 -04:00
William Falcon	9acee67c31	fixes 3549 (#3564 )	2020-09-19 20:00:50 -04:00
Carlos Mocholí	580b04b490	Fix ModelCheckpoints name formatting (#3163 ) * Fix ModelCheckpoint's name formatting * Fix failing tests * Add dot to CHECKPOINT_SUFFIX * Set variables to their default values at the end of tests * Fix logic for filepath='' and filename=None. Add test * Fix Windows tests * Fix typo. Remove leading line break and zeroes * Remove CHECKPOINT_SUFFIX * Fix typos. Use appropriate f-string format * Apply suggestions from code review * Fix broken tests after #3320 * Finish changes suggested by Borda * Use explicit test var names * Apply suggestions Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * Apply suggestions Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * Update CHANGELOG * Apply suggestions from code review * for * prepend whitespace in warn msg Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2020-09-18 23:09:11 +02:00
Abe Botros	76c4afb840	Fix IoU score for classes not present in target or pred (#3098 ) * Fix IoU score for classes not present in target or pred Fixes #3097 - Allow configurable not_present_score for IoU for classes not present in target or pred. Defaults to 1.0. - Also allow passing `num_classes` parameter through from iou metric class down to its underlying functional iou call. * Changelog: move IoU not-present score fix to [unreleased] * IoU: avoid recomputing class presence in target and pred Use already-computed support, true positives, and false positives to determine if a class is not present in either target or pred. * Test IoU against sklearn jaccard_score Also add TODO to test our IoU's not_present_score against sklearn's jaccard_score's zero_division when it beecomes available. * IoU: remove_bg -> ignore_index Fixes #2736 - Rename IoU metric argument from `remove_bg` -> `ignore_index`. - Accept an optional int class index to ignore, instead of a bool and instead of always assuming the background class has index 0. - If given, ignore the class index when computing the IoU output, regardless of reduction method. * Improve documentation for IoU not_present_score * Update default IoU not_present_score to 0.0 * Add note about IoU division by zero * Rename IoU not_present_score -> absent_score * Update IoU absent score changelog wording * Condense IoU absent_score argument docstring * Remove unnecessary IoU ignore_index comment * docstrings * isort * flake8 * Fix test of IoU against sklearn jaccard Use macro instead of micro averaging in sklearn's jaccard score, to match multi-class IoU, which conventionally takes per-class scores before averaging. Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>	2020-09-17 10:37:49 +02:00
Phil	b5dc6998ae	Disable train dataloader shuffle when overfit_batches is active. (#3501 ) * Disable train dataloader shuffle when overfit_batches is active. * pep8 Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-09-15 05:07:27 -04:00
William Falcon	1d7c615d82	cleaning up stale logger tests + flake8 (#3490 ) * cleaning up stale logger tests * cleaning up stale logger tests * cleaning up stale logger tests * cleaning up stale logger tests * cleaning up stale logger tests * cleaning up stale logger tests	2020-09-14 00:06:48 -04:00
William Falcon	cd16aa9854	ref: checkpoint connector methods 4/n (#3474 ) * ref: checkpoint connector methods 4/n * ref: checkpoint connector methods 4/n * ref: checkpoint connector methods 4/n * ref: checkpoint connector methods 4/n * ref: checkpoint connector methods 4/n * ref: checkpoint connector methods 4/n * ref: checkpoint connector methods 4/n * ref: checkpoint connector methods 4/n * ref: checkpoint connector methods 4/n	2020-09-12 08:42:27 -04:00
Rohit Gupta	a1ea681c47	Fix batch_outputs with optimizer frequencies (#3229 ) * Fix batch_outputs with optimizers frequencies * optimizers * fix batch_outputs with optimizer frequencies * clean test * suggestion Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * chlog * failing doctest * failing doctest * update doctest * chlog Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2020-09-10 23:01:20 +02:00
William Falcon	5abf7d9123	ref: move lr_finder (#3434 ) * ref: move lr_finder * ref: move lr_finder * ref: move lr_finder * ref: move lr_finder * ref: move lr_finder * ref: move lr_finder * ref: move lr_finder	2020-09-09 22:12:27 -04:00
William Falcon	b36c5e86d0	ref: trainer argparse 1/n (#3421 ) * ref: trainer argparse 1/n * ref: trainer argparse 1/n * ref: trainer argparse 1/n * ref: trainer argparse 1/n * ref: trainer argparse 1/n * ref: trainer argparse 1/n * ref: trainer argparse 1/n * ref: trainer argparse 1/n	2020-09-09 12:31:17 -04:00
Adrian Wälchli	e245065fbc	limit auto scaling batch size to the size of the training dataset (#3271 ) * fix * fix and test * fix merge error * test for max dataset size * changelog * update docs * fix merge * unused imports * imports	2020-09-09 10:51:43 +02:00
William Falcon	ff5f099cb7	ref: remove inner train loop 1/n (#3397 ) * ref: remove inner train loop 1/n * ref: remove inner train loop 1/n	2020-09-08 12:05:00 -04:00
William Falcon	d438ad8a8d	ensure calling test multiple times does not change results (#3391 )	2020-09-07 22:25:12 -04:00
William Falcon	b76d9e5dd5	Refa22 (#3388 ) * ref: inner train loop (intermediate step) 20/n * ref: inner train loop (intermediate step) 21/n * ref: inner train loop (intermediate step) 21/n * ref: inner train loop (intermediate step) 21/n * ref: inner train loop (intermediate step) 21/n * ref: inner train loop (intermediate step) 21/n	2020-09-07 16:45:31 -04:00
William Falcon	0b5b70d6c9	ref: inner train loop (intermediate step) 17/n (#3376 ) * ref: inner train loop (intermediate step) 17/n * ref: inner train loop (intermediate step) 17/n * ref: inner train loop (intermediate step) 17/n	2020-09-07 09:31:42 -04:00
William Falcon	69e3f904df	ref: inner train loop (intermediate step) 16/n (#3375 ) * ref: inner train loop (intermediate step) 16/n * ref: inner train loop (intermediate step) 16/n * ref: inner train loop (intermediate step) 16/n * ref: inner train loop (intermediate step) 16/n * ref: inner train loop (intermediate step) 16/n * ref: inner train loop (intermediate step) 16/n	2020-09-06 21:57:20 -04:00
William Falcon	85421466ab	ref: inner train loop (intermediate step) 10/n (#3369 )	2020-09-06 08:59:58 -04:00
Adrian Wälchli	48c22c8bad	update batch size in DataModule when auto scaling batch size (#3266 ) * fix datamodule hasattr * fix patch check * fix setattr * update docs * revert patch fix * changelog * fix datamodule passed in as fit arg * docs * set datamodule batch size in lightning_setattr * fix merge * check with has_attr * access datamodule via trainer * pass fit args down to tuner * docs * fix typos in docs Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2020-09-03 22:07:49 +02:00

226 Commits