lightning

Commit Graph

Author	SHA1	Message	Date
Rohit Gupta	783750547d	disable optimizers setup during testing (#3059 ) * disable configure_optimizers during testing * minor changes * hvd and ddp * fix precision during testing * fix ddp * fix amp * fix cpu * update dp * simplify optimizers * add test * codefactor * ref optimizer setup * chlog * suggestions * isort * rebased with master	2020-09-29 01:09:04 +02:00
William Falcon	4d5c0fa1bc	ref: separate flow vs log tests (#3704 )	2020-09-28 12:01:52 -04:00
William Falcon	cdd7266cd8	ref: enable self.log from val step (#3701 ) * .log in eval * ref * ref: enable self.log in val step	2020-09-28 10:49:07 -04:00
William Falcon	2ecaa2a8be	ref: (2/n) fix no log in epoch end (#3699 )	2020-09-28 08:25:44 -04:00
William Falcon	ddd11075bd	[WIP] ref: deprecated results obj, added support for simpler comms (1/n) (#3681 ) * ref: deprecated results obj, added support for simpler comms. Decouples logging from loops * fix global step err * fix typing err * fix str * fix typing err	2020-09-27 23:19:46 -04:00
William Falcon	ff2bab0996	ref: (results 1/n) enable tracking original metric when step and epoch are both true (#3685 ) * enable tracking original metric when step and epoch are both true	2020-09-27 22:08:31 -04:00
William Falcon	931995b55b	remove flake 8 (#3687 )	2020-09-27 20:40:02 -04:00
Adrian Wälchli	f37e9e8a83	Fix global step increment on training_epoch_end (#3673 ) * fix * fix global step err Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-09-27 20:19:51 -04:00
Adrian Wälchli	d15fd751c7	change default save_top_k, save_last to None (#3680 ) * topk default * fix test that doesn't have best available * remove print * #3680 changes * fix backward * temp revert te * add warning by carmocca * format docstring for test * specify monitor in ES test with top k * improve docstring for save_last * remove commented lines * revert passing model to test * undo regex mistake * changelog * fix test covering case monitor=None and savetopk=-1 * docstring * fix test for saving all checkpoints * don't save checkpoints for save_top_k=0 * add test for savetopk=0 Co-authored-by @carmocca Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>	2020-09-27 20:05:02 -04:00
ananthsub	94c79bb3ba	Add a reference to the Trainer on the LightningDataModule (#3684 ) * Split out changes from #3563 to make that PR easier to review. This formats the file according to the Black formatter * Store a reference to the trainer on the datamodule Fixes #3682 * Update data_connector.py * Update data_connector.py * Update test_datamodules.py	2020-09-27 19:48:01 -04:00
Pariente Manuel	3d76f604bd	Add ModelCheckpoint.to_yaml method (#3048 ) * Add ModelCheckpoint.to_json() * Add ModelCheckpoint.to_json() test * Fix W292: Add new line at end of file * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com> * Fixed tests * Update pytorch_lightning/callbacks/model_checkpoint.py * Apply suggestions from code review * fix test Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2020-09-27 14:39:40 +02:00
William Falcon	d79bce1dff	enable None model checkpoint default (#3669 ) * enable None model checkpoint default	2020-09-26 23:14:04 -04:00
Adrian Wälchli	3ff5327e83	Mocking loggers (part 1, wandb) (#3596 ) * mocking for wandb * remove wandb import in amp test * mock loggers in sphinx * check tests * Update extra.txt * setup * dev * min * revert Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>	2020-09-25 16:00:02 +02:00
Carlos Mocholí	e70aea7642	Allow ModelCheckpoint monitor to be None (#3633 ) * Fix ModelCheckpoint period * Test for less epochs	2020-09-25 15:54:04 +02:00
Carlos Mocholí	ed12e422a4	Fix incorrect "Saving latest checkpoint" warning (#3588 ) * Fix incorrect "Saving latest checkpoint" warning * Replace warning with info. Run PyCharm's optimize imports * Remove unused class variable. Refactor logic. Improve test * Fix De Morgan's	2020-09-25 14:18:06 +02:00
Antoine Broyelle	17c8c95fbc	Wrap prepare_data and setup only once inside DataModule (#3654 ) Fix #3652	2020-09-25 07:09:50 -04:00
Carlos Mocholí	908382f196	Split GPUStatsMonitor function (#3644 ) * Split function * Add docstrings * Add typing annotations * Minor refactor * Make static to add a test	2020-09-25 07:30:30 +02:00
Jirka Borovec	aa52c930f4	test examples (#3643 ) * test examples * testing * testing * typo * req * exception Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2020-09-24 17:33:11 +02:00
Adrian Wälchli	3affa0e49a	use tmpdir in tests when writing predictions to disk (#3561 ) * save to tmpdir * path	2020-09-23 07:44:15 -04:00
William Falcon	031274c25d	fix dp issues + update examples and test examples (#3618 ) * fix dp * fix examples	2020-09-23 00:19:46 -04:00
William Falcon	c591013708	enable any logged metric to be accessible in callbacks (#3598 ) * enable any logged or written metric to be accessible in callbacks * clarify forward	2020-09-22 18:00:23 -04:00
Nicki Skafte	88e6b29bba	faster tests (#3604 )	2020-09-22 07:37:34 -04:00
Carlos Mocholí	1223cdbaa1	Add missing line. Add a test (#3594 )	2020-09-21 22:17:51 -04:00
Nicki Skafte	b1347c956a	[Metrics] AUROC error on multilabel + improved testing (#3350 ) * error on multilabel * fix tests * fix pep8 * changelog * update doc test * fix doctest * fix doctest * update from suggestion * Apply suggestions from code review Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * Update test_classification.py * Update test_classification.py * retrigger test * 'pep8 Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2020-09-21 11:46:48 +02:00
William Falcon	21cfdf6874	ref: result 1/n (make monitor default to checkpoint_on to simplify re… (#3571 ) * ref: result 1/n (make monitor default to checkpoint_on to simplify result syntax) * Update pytorch_lightning/callbacks/model_checkpoint.py Co-authored-by: ananthsub <ananth.subramaniam@gmail.com> * ref: result 1/n (make monitor default to checkpoint_on to simplify result syntax) * force crash when max_epochs < epochs in a checkpoint Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>	2020-09-20 22:58:43 -04:00
William Falcon	277538970d	force crash when max_epochs < epochs in a checkpoint (#3580 ) * force crash when max_epochs < epochs in a checkpoint	2020-09-20 22:04:22 -04:00
William Falcon	9acee67c31	fixes 3549 (#3564 )	2020-09-19 20:00:50 -04:00
Rohit Gupta	07b857769a	Allow kwargs in Wandb & Neptune + kwargs docstring (#3475 ) * Allow kwargs in WandbLogger * isort * kwargs docstring * typo * kwargs for other loggers * pep and isort * formatting * fix failing test Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2020-09-19 18:51:43 +02:00
Jirka Borovec	8eb77cd06a	drop v0.10 deprecated (#3454 ) * drop v0.10 deprecated * import * missed	2020-09-19 11:47:26 -04:00
Boris Feld	e2af4f120e	Improve Comet Logger pickled behavior (#2553 ) * Improve Comet Logger pickled behavior * Delay the creation of the actual experiment object for as long as we can. * Save the experiment id in case an Experiment object is created so we can continue the same experiment in the sub-processes. * Run pre-commit on the comet file. * Handle review comment Make most Comet Logger attribute protected as they might not reflect the final Experiment attributes. Also fix the typo in the test name. * Ensure that CometLogger.name and CometLogger.version always returns str * Add new test for CometLogger.version behavior * Add new tests for CometLogger.name and CometLogger.version * Apply review suggestions * Apply suggestions from code review Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Remove extraneous comments in Comet logger tests * Fix lint issues * Apply suggestions from code review Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2020-09-18 23:26:29 +02:00
Carlos Mocholí	580b04b490	Fix ModelCheckpoints name formatting (#3163 ) * Fix ModelCheckpoint's name formatting * Fix failing tests * Add dot to CHECKPOINT_SUFFIX * Set variables to their default values at the end of tests * Fix logic for filepath='' and filename=None. Add test * Fix Windows tests * Fix typo. Remove leading line break and zeroes * Remove CHECKPOINT_SUFFIX * Fix typos. Use appropriate f-string format * Apply suggestions from code review * Fix broken tests after #3320 * Finish changes suggested by Borda * Use explicit test var names * Apply suggestions Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * Apply suggestions Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * Update CHANGELOG * Apply suggestions from code review * for * prepend whitespace in warn msg Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2020-09-18 23:09:11 +02:00
Lucas Steinmann	197acd535f	Fix early stopping with training step's return dict (#3347 ) * Fixes the test for early stopping without val step. The expression which checked, if early stopping was triggered, had an off-by-one error and hence was true even if early stopping was not triggered. Furthermore set patience to 0 and max epochs to 10, to ensure loss has enough time to flatten. * Fixes early stopping without val step. The issue has been, that only `early_stop_on` key was checked and not an arbitrary monitor key. * Fixes branch, which checks whether early stopping is done during validation. Before only `val_early_stop_on` was checked. Since arbitrary keys can be used, the set of possible validation keys cannot be exhaustive. Hence this disables "early stopping on_train_epoch_end" via an instance attribute if early stopping was executed in on_validation_epoch_end. Furthermore adds a test, which ensures arbitrary keys work. * Improve check whether eval results are used. Only disable early checking with train results if eval results are actually used. Before they were always disabled in ``on_validation_epoch_end``. Rename and document instance variable, to make it more clear. * Remove wrong documentation on behaviour of early stopping with train result' dict. * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-09-18 23:08:04 +02:00
Jirka Borovec	7b64472ced	fix lib paths after Wandb 0.10 (#3520 ) * try * try * drop 0.20 * drop 0.19.5 * -U * Fixed Horovod in CI due to wandb==0.10.0 sys.path modifications (#3525) Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * format * wb freeze * types Co-authored-by: Travis Addair <taddair@uber.com>	2020-09-17 08:37:49 -04:00
Abe Botros	76c4afb840	Fix IoU score for classes not present in target or pred (#3098 ) * Fix IoU score for classes not present in target or pred Fixes #3097 - Allow configurable not_present_score for IoU for classes not present in target or pred. Defaults to 1.0. - Also allow passing `num_classes` parameter through from iou metric class down to its underlying functional iou call. * Changelog: move IoU not-present score fix to [unreleased] * IoU: avoid recomputing class presence in target and pred Use already-computed support, true positives, and false positives to determine if a class is not present in either target or pred. * Test IoU against sklearn jaccard_score Also add TODO to test our IoU's not_present_score against sklearn's jaccard_score's zero_division when it beecomes available. * IoU: remove_bg -> ignore_index Fixes #2736 - Rename IoU metric argument from `remove_bg` -> `ignore_index`. - Accept an optional int class index to ignore, instead of a bool and instead of always assuming the background class has index 0. - If given, ignore the class index when computing the IoU output, regardless of reduction method. * Improve documentation for IoU not_present_score * Update default IoU not_present_score to 0.0 * Add note about IoU division by zero * Rename IoU not_present_score -> absent_score * Update IoU absent score changelog wording * Condense IoU absent_score argument docstring * Remove unnecessary IoU ignore_index comment * docstrings * isort * flake8 * Fix test of IoU against sklearn jaccard Use macro instead of micro averaging in sklearn's jaccard score, to match multi-class IoU, which conventionally takes per-class scores before averaging. Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>	2020-09-17 10:37:49 +02:00
Jirka Borovec	c64520e658	fix tensorboard version (#3132 ) * tensorboard version * WIP test tb hparams logs (#3040) * optional * req * tensorboard>=2.2.0 * data * data * TB Co-authored-by: Rosario Scalise <rosario@cs.washington.edu>	2020-09-15 23:48:48 +02:00
Adrian Wälchli	4ed96b2eb4	fix gradient norm tracking for row_log_interval > 1 (#3489 ) * fix + test * changelog * Apply suggestions from code review Co-authored-by: Tim Chard <timchard@hotmail.com> * improve test Co-authored-by: Tim Chard <timchard@hotmail.com> Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>	2020-09-15 18:41:27 +02:00
Nicki Skafte	28af34bc51	[Metrics] Class reduction similar to sklearn (#3322 ) * new class reduce interface * update docs * pep8 * update_class_metrics * fix doctest * changelog * fix docs * fix codefactor * fix codefactor * formatting * fix typo * fix typo * typo pr -> per * update from suggestion * fix error * Apply suggestions from code review * Update CHANGELOG.md * formatting * timeouts * docstring formatting for reg metrics * pep * flake8 * revert workflow changes * suggestions Co-authored-by: Nicki Skafte <nugginea@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai> Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>	2020-09-15 14:36:14 +02:00
Alexander	5732a56560	Pass epoch argument to Comet Logger (#3438 ) * Pass epoch argument * Copy epoch instead of inplace pop * Remove whitespace * Add test for epoch logging * add docstring Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2020-09-15 14:30:42 +02:00
Phil	b5dc6998ae	Disable train dataloader shuffle when overfit_batches is active. (#3501 ) * Disable train dataloader shuffle when overfit_batches is active. * pep8 Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-09-15 05:07:27 -04:00
Justus Schock	4dc4c8cfa5	Metric aggregation (#3321 ) * metric aggregation * metric aggregation * add at_least_1d * fix output formatting * add metric tests * add missing test case * remove reduce_op frm metric classes * fix reduce_op stuff * start test fixing * fix tests due to aggregation * fix faulty import * Apply suggestions from code review Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * remove reduce_op docstrings * add compute * remove import * remove collection metric * update base class * update tests * Update metric.py * Update metric.py * Apply suggestions from code review * change default aggregate Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>	2020-09-14 07:23:11 -04:00
Cookie_thief	a552d4a2d5	fix normalize mode at confusion matrix (replace nans with zeros) (#3465 ) * replace nans to 0 at conf. matrix & update tests * cm.isnan() -> torch.isnan(cm) * fix row-wise division while normalize * update tests * pep8 fix * Update tests/metrics/test_classification.py add comment to test Co-authored-by: Nicki Skafte <skaftenicki@gmail.com> * Update tests/metrics/functional/test_classification.py Co-authored-by: Nicki Skafte <skaftenicki@gmail.com> * Update pytorch_lightning/metrics/functional/classification.py Co-authored-by: Nicki Skafte <skaftenicki@gmail.com> * final update Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>	2020-09-14 10:05:51 +02:00
William Falcon	1d7c615d82	cleaning up stale logger tests + flake8 (#3490 ) * cleaning up stale logger tests	2020-09-14 00:06:48 -04:00
William Falcon	59d8472548	ref: slurm connector 1/n (#3476 ) * ref: slurm connector 1/n	2020-09-12 11:07:15 -04:00
William Falcon	cd16aa9854	ref: checkpoint connector methods 4/n (#3474 ) * ref: checkpoint connector methods 4/n	2020-09-12 08:42:27 -04:00
William Falcon	de99222834	ref: accelerator connector methods x/n (#3469 ) * ref: accelerator connector methods x/n	2020-09-11 21:52:22 -04:00
ananthsub	d1d48e2ea1	Fix trivial comparison in model checkpoint test (#3464 ) We were comparing keys across the same checkpoint dict instead of ckpt_last vs ckpt_last_epoch All other changes here are formatting	2020-09-11 20:50:46 +02:00
Adrian Wälchli	bd5f53c519	implement fix and test (#3459 )	2020-09-11 10:55:58 -04:00
Nicki Skafte	93cf6d0054	[Metrics] class based embedding similarity + tests (#3358 ) * embedding similarity class + test * fix tests * fix pep8 * add docs * noindex * Update docs/source/metrics.rst * Update pytorch_lightning/metrics/self_supervised.py Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * Update pytorch_lightning/metrics/self_supervised.py Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * suggestions * changes to init * move __all__ * fix imports * Apply suggestions from code review * assert typo * change import Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Nicki Skafte <nugginea@gmail.com>	2020-09-11 12:11:50 +02:00
Cookie_thief	d05d4c78e1	add num_classes argument to confusion matrix (#3450 ) * add num_classes arg to confusion matrix * update ConfusionMatrix test * final update)	2020-09-10 18:39:04 -04:00
Rohit Gupta	a1ea681c47	Fix batch_outputs with optimizer frequencies (#3229 ) * Fix batch_outputs with optimizers frequencies * optimizers * fix batch_outputs with optimizer frequencies * clean test * suggestion Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * chlog * failing doctest * failing doctest * update doctest * chlog Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2020-09-10 23:01:20 +02:00
William Falcon	5abf7d9123	ref: move lr_finder (#3434 ) * ref: move lr_finder	2020-09-09 22:12:27 -04:00
William Falcon	b36c5e86d0	ref: trainer argparse 1/n (#3421 ) * ref: trainer argparse 1/n	2020-09-09 12:31:17 -04:00
Patrick Orlando	656c1af0df	Get experiment_id from MLFlow only once instead of each training loop (#3394 ) * Get experiment_id from MLFlow only once instead of each training loop. * Apply suggestions from code review Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * add test that asserts mlflow client is called to retrieve experiment id only once * make pep8 happy * logs Co-authored-by: Patrick Orlando <patrick.orlando@rea-group.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>	2020-09-09 11:38:26 +02:00
Adrian Wälchli	e245065fbc	limit auto scaling batch size to the size of the training dataset (#3271 ) * fix * fix and test * fix merge error * test for max dataset size * changelog * update docs * fix merge * unused imports * imports	2020-09-09 10:51:43 +02:00
William Falcon	8f6b115511	ref: added model connector (#3407 ) * ref: added model connector	2020-09-09 00:24:20 -04:00
William Falcon	722c44c7d0	ref: device to gpus (#3405 ) * ref: device to gpus	2020-09-08 22:14:17 -04:00
Travis Addair	091d37f968	Added check for apex AMP and unit tests for Horovod + AMP (#3404 ) * Added check for apex AMP and unit tests for Horovod + AMP * Changelog * Fixed order of Horovod and Apex optimizer wrapping	2020-09-08 20:30:57 -04:00
William Falcon	aaf26d70c4	ref: device parser (#3400 ) * ref: train loop refactors part 2: 1/n * ref: device parser	2020-09-08 18:46:42 -04:00
William Falcon	ff5f099cb7	ref: remove inner train loop 1/n (#3397 ) * ref: remove inner train loop 1/n	2020-09-08 12:05:00 -04:00
William Falcon	d438ad8a8d	ensure calling test multiple times does not change results (#3391 )	2020-09-07 22:25:12 -04:00
William Falcon	b76d9e5dd5	Refa22 (#3388 ) * ref: inner train loop (intermediate step) 20/n * ref: inner train loop (intermediate step) 21/n	2020-09-07 16:45:31 -04:00
William Falcon	0b5b70d6c9	ref: inner train loop (intermediate step) 17/n (#3376 ) * ref: inner train loop (intermediate step) 17/n	2020-09-07 09:31:42 -04:00
William Falcon	69e3f904df	ref: inner train loop (intermediate step) 16/n (#3375 ) * ref: inner train loop (intermediate step) 16/n	2020-09-06 21:57:20 -04:00
William Falcon	7073de8a95	ref: inner train loop (intermediate step) 14/n (#3373 ) * ref: inner train loop (intermediate step) 14/n	2020-09-06 19:55:18 -04:00
William Falcon	85421466ab	ref: inner train loop (intermediate step) 10/n (#3369 )	2020-09-06 08:59:58 -04:00
Rohit Gupta	24809b0b26	Refactor GPUStatsMonitor to improve training speed (#3257 ) * Refactor GPUMonitor to improve training speed * added gpu ids to monitor * update tests * added deprecation warning * pep * fix test * fix docs * fix log_gpu_memory * move deprecation check * chlog * Update CHANGELOG.md * suggestions and fix Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-09-04 06:02:16 -04:00
Adrian Wälchli	48c22c8bad	update batch size in DataModule when auto scaling batch size (#3266 ) * fix datamodule hasattr * fix patch check * fix setattr * update docs * revert patch fix * changelog * fix datamodule passed in as fit arg * docs * set datamodule batch size in lightning_setattr * fix merge * check with has_attr * access datamodule via trainer * pass fit args down to tuner * docs * fix typos in docs Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2020-09-03 22:07:49 +02:00
Adrian Wälchli	4ad5a78dce	to_torchscript method for LightningModule (#3258 ) * script * docs * simple test * move test * fix doctest * no grad context * extend tests test test * datamodule test * clean up test * docs * name * fix import * update changelog * fix import * skip pytorch 1.3 in test * update codeblock * skip bugged 1.4 * typehints * doctest not working on all pytorch versions * rename TestGAN to prevent pytest interference * add note about pytorch version * fix torchscript version inconsistency in tests * reset training state + tests * update docstring * Apply suggestions from code review Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com> * update docstring, dict return * add docs to index * add link * doc eval mode * forward * optional save to file path * optional * test torchscript device * test save load with file path * pep * str * Commit typing suggestion Co-authored-by: ananthsub <ananth.subramaniam@gmail.com> * skip test if cuda not available Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com> Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>	2020-09-03 20:24:44 +02:00
Rohit Gupta	4a22fca524	Changed LearningRateLogger to LearningRateMonitor (#3251 ) * Change LearningRateLogger to LearningRateMonitor * file rename * docs * add LearningRateLogger with deprecation warning * deprecated LearningRateLogger * move deprecation check * chlog Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>	2020-09-03 18:17:15 +00:00
HT Liu	d521c1b178	Fix: gather_all_tensors cross GPUs in DDP (#3319 ) * Fix: gather_all_tensors cross GPUs in metrics * add a test case for gather_all_tensors_ddp in #3253	2020-09-03 12:27:32 +02:00
William Falcon	0d90d53a81	ref: moving train loop to own object 2/n (intermediate steps) (#3313 ) * ref: moving train loop to own object 2/n (intermediate steps)	2020-09-01 21:06:40 -04:00
Nicki Skafte	b66ce88f0d	[metrics] Renaming of precision recall metric (#3308 ) * rename metrics * update docs	2020-09-01 14:59:33 -04:00
William Falcon	7d57f8d407	ref: move prepare_data to data connector (#3307 ) * ref: moved argparse code to central class	2020-09-01 14:59:09 -04:00
Lezwon Castelino	3910ad0330	bugfix/3185 transpose (#3252 ) * change t() to transpose() as xla devices do not support .t() on 1-dim tensor * detach tensor before copying * Revert "detach tensor before copying" This reverts commit `37cc7bbe` * changed dims * added test_result_obj_on_tpu * detach before copying * replace torch.cat with sum	2020-09-01 09:17:52 -04:00
William Falcon	805ff37e8c	ref: .tune() (temporary) (#3293 ) * ref: .tune()	2020-08-31 17:36:09 -04:00
William Falcon	caf7893f27	ref: modular is_overridden (#3290 ) * ref: modular is_overridden	2020-08-31 12:12:02 -04:00
Carlos Mocholí	cc80749c7e	Parse Union[bool, str] arguments (#3235 ) * Parse Union[bool, str] arguments * Address review Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-08-29 10:39:42 -04:00
Jeremy Jordan	a5d1176cf6	callback method for on_save_checkpoint (#2501 ) * initial draft * fix test * Update pytorch_lightning/trainer/callback_hook.py Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * fix tests * remove old code * untested upgrade script * document limitations * clean up and add tests * Update pytorch_lightning/trainer/training_io.py Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * reflect PR comments * fix formatting * Update docs/source/callbacks.rst * clarify docs * revert change for loading checkpoints * small edits Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2020-08-28 16:50:52 +02:00
monney	d5254ff9df	warn user when dropping unpicklable hparams (#2874 ) * refactored clean_namespace * Update try except to handle pickling error * Consolidated clean_namespace. Added is_picklable * PEP8 * Change warning to use rank_zero_warn. Added Test to ensure proper hparam filtering * Updated imports * Corrected Test Case	2020-08-28 09:07:43 +02:00
Rohit Gupta	85cd558a3f	Follow up of #2892 (#3202 ) * Follow up of #2892 * typo * iterabledataset	2020-08-27 15:28:29 -04:00
Rohit Gupta	f03943ee94	Fix GpuUsageLogger to work on different platforms (#3008 ) * Fix GpuUsageLogger * docstrings * misconfigexception * add basic tests * skip doctest * fix parameter and docstring * rm cl * skip doctest * cleanup * chlog * add suggestions from review * add test from suggestions * fix import * fix test * fix test * fix test * fix test * rename GpuUsageLogger to GPUStatsMonitor * doc fix * Apply suggestions from code review * update docs format * update docs * miss * merge * fix title formatting * unindent * punctuation * simplify if statements * fix test * suggestions * pep * Update CHANGELOG.md Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * fix on_train_batch_* * use AttributeDict * usage * rank zero Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * import * minor changes Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Adrian Wälchli <adrian.waelchli@inf.unibe.ch> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-08-27 19:50:32 +02:00
William Falcon	f3c63f7746	tests to ensure correct dataloader calls (#3221 ) * tests to ensure correct dataloading interval and sequence * tests to ensure correct dataloading interval and sequence * tests to ensure correct dataloading interval and sequence * tests to ensure correct dataloading interval and sequence * tests to ensure correct dataloading interval and sequence	2020-08-27 09:49:46 -04:00
William Falcon	a1705441a9	ref: remove _evaluate fx (#3197 ) * remove _evaluate * remove _evaluate * remove _evaluate * remove _evaluate * remove _evaluate * remove _evaluate * remove _evaluate * remove _evaluate	2020-08-26 12:28:14 -04:00
Lezwon Castelino	d9ea25590e	fix ONNX model save on GPU (#3145 ) * added to(device) * added test * fix test on gpu * Update pytorch_lightning/core/lightning.py Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com> * Update pytorch_lightning/core/lightning.py Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com> * remove multi gpu check Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com> * updated message * Update pytorch_lightning/core/lightning.py Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com> * updated test * onxx to onnx * Update pytorch_lightning/core/lightning.py Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * Update tests/models/test_onnx.py Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * add no grad Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * add isinstance back * chlog * error is input_sample is not Tensor Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>	2020-08-26 16:22:19 +00:00
Sordie	888340d17e	Fix RMSLE metric (#3188 ) * fix rmsle * Updated test to match rmsle fix * Updated RMSLE example result to match functional * chlog * add randomized test * fix pep8 Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai> Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>	2020-08-26 08:02:53 -04:00
Nicki Skafte	17d8773106	New modular metric interface (#2528 ) * new base structure * missing packages * updated interface * revert some changes * fixes * add changelog * fix bug * added description * test for pickable * fixing test * fixing test * fix pickle issue * reduceop typehints back * remove redundant module arg * add save/load test * add aggregate method * text clarification * fix doctest * Apply suggestions from code review * change test to results obj * fix docs * formatting Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * formatting * pep * Update CHANGELOG.md * suggestions * fix tests * fix pep8 * fix tests Co-authored-by: Nicki Skafte <nugginea@gmail.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2020-08-26 13:01:29 +02:00
William Falcon	bda1400225	ref: restore on_eval_start hook (#3183 ) * restore eval loop hook	2020-08-26 00:45:43 -04:00
William Falcon	2f6d82e0e6	ref: remove on_eval_start hook (#3176 ) * remove on_eval_start hook * remove on_eval_start hook	2020-08-25 22:28:00 -04:00
William Falcon	6068b29d29	ref: remove obscure forward call in eval + CPU backend ___step (#3123 ) * remove obscure forward call in eval * remove obscure forward call in eval * remove obscure forward call in eval * remove obscure forward call in eval * remove obscure forward call in eval * remove obscure forward call in eval	2020-08-24 12:31:40 -04:00
Uladzislau Sazanovich	2d42ec008f	Make trainer.state a read-only property (#3109 ) * Make trainer.state a read-only property * Update states.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-08-24 16:49:33 +02:00
William Falcon	8d7ca5cd2c	ref: refactored gpu backend __step (#3120 ) * refactored gpu backend __step * refactored gpu backend __step * refactored gpu backend __step * refactored gpu backend __step	2020-08-24 09:22:05 -04:00
Jirka Borovec	45e7491dcc	drop packaging (#3105 )	2020-08-24 05:28:56 -04:00
s-rog	7b054399c6	fix tb hparams logging (#2974 ) * log_hyperparams add default metric also adds scalar support * fix typos and style * another typo * keep original logging implementation * remove missed line * fix capitalization * add step to leg_metrics for tests * disable hp metric none (-1) logging to pass tests * initial arg implementation * add step to log_metrics * add hp_metric case to log test * add docs and minor formatting * fix broken else * pep8 style * edit tests * Update pytorch_lightning/loggers/tensorboard.py Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * Update pytorch_lightning/loggers/tensorboard.py Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2020-08-24 06:57:04 +00:00
Rohit Gupta	34c88d127b	Fix log_graph in TensorBoardLogger (#3092 )	2020-08-22 06:35:09 -04:00
Rohit Gupta	7cca3859a7	Fix num_sanity_val_steps is clipped to limit_val_batches (#2917 ) * Fix num_sanity_val_steps according to limit_val_steps * fix test * add num_sanity_batches * pep * update docstring in test * add more test * chlog * update comments and docstring in test Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Adrian Wälchli <adrian.waelchli@inf.unibe.ch> Co-authored-by: Ananya Harsh Jha <ananya@pytorchlightning.ai>	2020-08-21 20:11:31 +02:00
Jirka Borovec	bcdb750976	changelogs clean (#3082 ) * clean * ver	2020-08-20 22:58:53 +00:00
Nathan Raw	bab89b8d21	Add transfer_batch_to_device hook to DataModule (#3038 ) * ✨ add dm to_device logic in trainer * 🔥 remove unnecessary comment * ✨ add to_device logic to datamodule * ✅ add test * updated docs Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-08-20 08:47:11 -04:00
Peter Yu	cee5eaf659	flake8 fixes (#3064 ) * flake8 fixes * fix pep8 * fix pep8 Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-08-20 07:45:22 -04:00
Peter Yu	88886ace72	More robust way of collecting init argument names for LightningModules (#3066 ) When a LightningModule inherits from a class that implements `__new__()` such as `typing.Generic`, `inspect.signature(cls)` short-circuits and returns the signature of `__new__()` instead of `__init__()`. So, we need to be more specific and call inspection directly on the init function.	2020-08-20 07:19:11 -04:00
William Falcon	3453bba898	re-enabled naming metrics in ckpt name (#3060 ) * re-enabled naming metrics in ckpt name * re-enabled naming metrics in ckpt name * re-enabled naming metrics in ckpt name * re-enabled naming metrics in ckpt name * re-enabled naming metrics in ckpt name * re-enabled naming metrics in ckpt name	2020-08-19 20:34:09 -04:00
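
Note on 88886ace72 (#3066): the message describes reading init argument names from `__init__` directly instead of from `inspect.signature(cls)`. The following is a minimal sketch of that idea, not the code from the commit. `Base`, `Model`, and `init_arg_names` are hypothetical names chosen for illustration, and `Base` stands in for any parent class that defines `__new__()`. What `inspect.signature(cls)` returns for such a class differs between Python versions, so the sketch only demonstrates the direct inspection of `__init__`.

```python
import inspect


class Base:
    """Hypothetical parent that defines __new__ with a catch-all signature."""

    def __new__(cls, *args, **kwargs):
        # Ignore the constructor arguments here; __init__ receives them afterwards.
        return super().__new__(cls)


class Model(Base):
    """Hypothetical stand-in for a LightningModule subclass."""

    def __init__(self, hidden_dim, learning_rate=1e-3):
        self.hidden_dim = hidden_dim
        self.learning_rate = learning_rate


def init_arg_names(cls):
    """Return the names of the __init__ parameters of cls, without self."""
    # Inspect the init function itself rather than the class, so that a
    # __new__ defined somewhere in the hierarchy cannot mask the result.
    params = inspect.signature(cls.__init__).parameters
    # The first parameter of the unbound function is the instance (self).
    return list(params)[1:]


print(init_arg_names(Model))
```

Inspecting `cls.__init__` gives the same answer whether or not a parent class defines `__new__()`, which is the property the commit message asks for.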

791 Commits