lightning

Commit Graph

Author	SHA1	Message	Date
William Falcon	3453bba898	re-enabled naming metrics in ckpt name (#3060 ) * re-enabled naming metrics in ckpt name * re-enabled naming metrics in ckpt name * re-enabled naming metrics in ckpt name * re-enabled naming metrics in ckpt name * re-enabled naming metrics in ckpt name * re-enabled naming metrics in ckpt name	2020-08-19 20:34:09 -04:00
Nicki Skafte	cefc7f7c32	Feature/log computational graph (#3003 ) * add methods * log in trainer * add tests * changelog * fix tests * fix tests * fix tests * fix tests * fix tests * fix tests * fix tests * text * added argument * update tests * fix styling * improve testing	2020-08-19 19:08:46 -04:00
Adrian Wälchli	7b917de946	fix setting batch_size attribute in batch_size finder (finishing PR #2523 ) (#3043 ) * lightning attr fix * revert refactor * create test * separate test * changelog update * tests * revert * Update pytorch_lightning/trainer/training_tricks.py Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-08-19 19:01:55 -04:00
Adrian Wälchli	89a5d8fee9	fix auto scale batch size not working with precision=16 (#3045 ) * add test * test * test * add fix * changelog * check batch size changed	2020-08-19 20:41:33 +00:00
William Falcon	8315a65d0a	fix result obj dp auto reduce (#3013 ) * fix result for dp * fix result for dp * fix result for dp * fix result for dp * fix result for dp * fix result for dp * fix result for dp * fix result for dp * fix result for dp * fix result for dp * fix result for dp * fix result for dp * added warning when changing monitor and using results obj	2020-08-17 10:29:39 -04:00
William Falcon	465d4ffd2c	added lr scheduler test using dev debugger (#3004 ) * added lr scheduler test using dev debugger * added lr scheduler test using dev debugger * added lr scheduler test using dev debugger	2020-08-16 11:37:38 -04:00
Adrian Wälchli	188e06c261	ddp fix for trainer.test() + add basic ddp tests (#2997 ) * add ddp script variations * add ddp test * rename * shell * test * test * try call * try without subprocess * test * display the error * list all variations * try string * try copy env * debug * pythonpath * path * update test * change * simple ddp test * replace * remove random port * random port * str * clean up * check run spawn * clean up * docs * docs * update test * docs * changelog * changelog	2020-08-16 11:19:57 -04:00
William Falcon	44802f7697	tasks docs	2020-08-15 22:36:53 -04:00
William Falcon	d702d4d393	removed callback metrics from test results obj (#2994 ) * removed callback metrics from test results obj * removed callback metrics from test results obj	2020-08-15 21:45:41 -04:00
Jeff Yang	73ebd1066d	Fix accumulate_grad_batches for last batch (#2853 ) * first attempt * update changelog * fix pep8 and tests * Apply suggestions from code review Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * added new tests * fixed tests * Apply suggestions from code review * used num_training_batches * fixed pep8 * fixed with is_last_batch suggested by @awaelchli * fixed with num_training_batches * fixed with num_training_batches * cleanup * fix test and update docs * fixed for alignment, update docs * minor changes * update doc Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2020-08-15 15:06:37 -04:00
William Falcon	7d36aac138	fix docs (#2987 )	2020-08-15 08:36:17 -04:00
William Falcon	b8371fa56c	Fixes #2972 #2946 (#2986 ) * add val step arg to metrics * add val step arg to metrics * add val step arg to metrics * add val step arg to metrics * add val step arg to metrics * add val step arg to metrics * add val step arg to metrics * add val step arg to metrics * add val step arg to metrics * add step metrics * add step metrics	2020-08-15 08:36:00 -04:00
Nathan Raw	b9695237f1	Save test predictions on multiple GPUs (#2926 ) * Save test predictions on multiple GPUs	2020-08-14 17:52:43 -04:00
William Falcon	e7794eb79a	Fixes #2407 (#2981 ) * fix gpus index error	2020-08-14 16:22:48 -04:00
William Falcon	48f658fbb5	Fixes #2943 (#2970 )	2020-08-13 21:44:55 -04:00
William Falcon	639a4cbd25	autoplay (#2968 )	2020-08-13 19:06:55 -04:00
Lezwon Castelino	cfd06a083b	Bugfix/2956 tpu distrib backend fix (#2959 ) * override dist backend when using tpus * added test * updated doc string * drop redundant info... * more redundant info Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>	2020-08-13 18:57:23 -04:00
William Falcon	b7fc805dcf	pep 8 (#2967 )	2020-08-13 18:56:02 -04:00
William Falcon	9a503de6af	Replace docs gifs with videos snippets so user can play at own speed (#2966 ) * update docs	2020-08-13 18:52:47 -04:00
Jeff Yang	07c023c32f	fix(docs): docstring for amp_backend (#2960 ) * fix(docs): docstring for amp_backend * fix(docs): early_stop_checkpoint -> early_stop_callback * docs Co-authored-by: ananyahjha93 <ananya@pytorchlightning.ai>	2020-08-13 23:25:56 +02:00
SiddhantRanade	88bfed371e	Fix enforce_datamodule_dataloader_override() for iterable datasets (#2957 ) This function has the if statement `if (train_dataloader or val_dataloaders) and datamodule:`. The issue is similar to that in https://github.com/PyTorchLightning/pytorch-lightning/pull/1560. The problem is that the `if(dl)` translates to `if(bool(dl))`, but there's no dataloader.__bool__ so bool() uses dataloader.__len__ > 0. But... dataloader.__len__ uses IterableDataset.__len__ for IterableDatasets for which __len__ is undefined. The fix is also the same, the `if dl` should be replaced by `if dl is not None`. Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>	2020-08-13 23:06:17 +02:00
William Falcon	2c935d048e	track batch size (#2954 )	2020-08-13 12:40:54 -04:00
Jirka Borovec	4354690e55	add apex test (#2921 ) * add apex test * rename * level * events * wrap * evt * miss * apex * apex * apex * apex * apex * apex * Update tests/models/test_amp.py Co-authored-by: William Falcon <waf2107@columbia.edu> * notes * notes Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-08-13 10:03:13 -04:00
William Falcon	6c5a0a172f	Resultd (#2947 ) * updated docs	2020-08-13 09:58:05 -04:00
Jirka Borovec	519b97effd	Clean save (#2933 ) * thr deterministic=True * clean * clean * Apply suggestions from code review Co-authored-by: Vadym Stupakov <vadim.stupakov@gmail.com> * Apply suggestions from code review Co-authored-by: Vadym Stupakov <vadim.stupakov@gmail.com>	2020-08-13 07:26:33 -04:00
William Falcon	a46130cdc1	add weighted average to results obj (#2930 ) * track batch size in result obj	2020-08-12 08:02:00 -04:00
Brendan Fahy	56396abe98	fix checkpointing to remote file paths (#2925 )	2020-08-12 06:31:17 -04:00
William Falcon	d13e5c9e53	document lightiningmodule better (#2920 ) * updated docs	2020-08-11 19:39:43 -04:00
Brendan Fahy	97e6f35b34	fix missing return statement. Do not normalize remote paths (#2894 ) * fix missing return statement. Do not normalize remote paths * Update pytorch_lightning/utilities/cloud_io.py Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * Add some documentation that we now support s3 and hdfs paths * suggestion from code review Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>	2020-08-09 22:38:43 +00:00
Uladzislau Sazanovich	e9846dd758	Add tracking of basic states in Trainer [wip - to-be-merged after v0.9] (#2541 ) * Add initial tracking of states in Trainer. * Add INTERRUPTED state, improve tests, move state switching from callback to a trainer. * Move part of a trainer state switching to a decorator. * Add documentation. * Fix docs, rename state enum, restore state to previous on exit if None, add tests for decorator only. * Fix callback typing. Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-08-09 06:24:09 -04:00
Brendan Fahy	6e77181ec7	Squashed commit of the following: (#2164 ) commit 29fb0506cd38a15c359e369cc8bc4435916b0c78 Author: Brendan Fahy <bmfahy@gmail.com> Date: Sat Aug 8 19:35:30 2020 +0000 fix checking for version for docs to build commit 467fd640db02275972c7111af031c86bb59333e9 Author: Brendan Fahy <bmfahy@gmail.com> Date: Sat Aug 8 18:56:05 2020 +0000 remove no local test commit a7cc9f88de00feec1a5406874d05313c42bd004c Author: Brendan Fahy <bmfahy@gmail.com> Date: Sat Aug 8 18:46:44 2020 +0000 fix commit 3fdbb729da79ae9348c83410a138666bad467951 Author: Brendan Fahy <bmfahy@gmail.com> Date: Sat Aug 8 18:23:30 2020 +0000 revert requirements commit 9b8686bd83e2bc243cf329e26f1c667c6949cf67 Author: Brendan Fahy <bmfahy@gmail.com> Date: Sat Aug 8 18:16:42 2020 +0000 make it a fixture commit eec74953d24c8b25268d3b6dde3cc4affdd5cb8f Author: Brendan Fahy <bmfahy@gmail.com> Date: Sat Aug 8 18:01:32 2020 +0000 fix up the testing commit 896d94a0e60083d52c81db2a036b7f1e015cad11 Author: Brendan Fahy <bmfahy@gmail.com> Date: Sat Aug 8 17:47:28 2020 +0000 fix some tests commit 6d22bde19767bf2b71dfd44839b01efdf6888f83 Merge: 6175d4e2 `6ebe0d72` Author: Brendan Fahy <bmfahy@gmail.com> Date: Sat Aug 8 10:20:47 2020 +0000 Merge remote-tracking branch 'origin/master' into tb_use_gfile commit 6175d4e26b15a43c412c26d501762cd0b570616a Author: Brendan Fahy <bmfahy@gmail.com> Date: Fri Aug 7 10:16:36 2020 +0000 Use tensorboard.compat.gfile to support remote writing	2020-08-09 06:08:44 -04:00
William Falcon	256059a1d0	tracks all outputs including TBPTT and multiple optimizers (#2890 ) * pl 0.9 update * pl 0.9 update * pl 0.9 update * pl 0.9 update * pl 0.9 update * pl 0.9 update * pl 0.9 update * pl 0.9 update * pl 0.9 update * pl 0.9 update * pl 0.9 update * pl 0.9 update * pl 0.9 update * pl 0.9 update * pl 0.9 update * pl 0.9 update * pl 0.9 update * pl 0.9 update * pl 0.9 update * pl 0.9 update * pl 0.9 update	2020-08-09 06:00:15 -04:00
Adrian Wälchli	1bb268ad8a	Clarify what gpus=0 means in docs (#2876 ) * docs clarify what gpus=0 means * add example suggested by @ydcjeff	2020-08-08 11:50:08 -04:00
Adrian Wälchli	f798cffd02	save last model after saving top_k when save_last=True (#2881 ) * save_last should be last * changelog * seed, docs * retrigger ci * compare filenames * move constants * fix test * epoch, global step * improve test	2020-08-08 06:02:43 -04:00
Jirka Borovec	a6e7aa7796	allow using apex with any PT version (#2865 ) * wip * setup * type * name * wip * docs * imports * fix if * fix if * use_amp * Apply suggestions from code review Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * Apply suggestions from code review Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * fix tests * Apply suggestions from code review Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * fix tests * todos Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2020-08-08 11:07:32 +02:00
Santiago Castro	fed0ac838b	Fix Trainer arg name in docs (#2879 ) * Fix Trainer arg name in docs * Fix a PR comment	2020-08-08 07:52:35 +02:00
Jirka Borovec	b7d72706c3	clean imports (#2867 ) * clean imports * miss	2020-08-08 00:33:51 +02:00
Jirka Borovec	f8c058215f	simplify tests & cleaning (#2588 ) * simplify * tmpdir * revert * clean * accel * types * test * edit test acc Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * Update test acc Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2020-08-07 23:22:05 +02:00
Iz Beltagy	2cc60c625e	fix set_epoch on TPUs (#2740 ) * fix https://github.com/PyTorchLightning/pytorch-lightning/issues/2622 * Update training_loop.py	2020-08-07 09:31:30 -04:00
William Falcon	f82d7feb6c	updated hooks (#2850 ) * modified hooks * modified hooks * modified hooks * modified hooks * modified hooks * modified hooks * modified hooks * modified hooks * modified hooks	2020-08-07 09:29:57 -04:00
ananthsub	b39f4798a6	Add support to Tensorboard logger for OmegaConf hparams (#2846 ) * Add support to Tensorboard logger for OmegaConf hparams Address https://github.com/PyTorchLightning/pytorch-lightning/issues/2844 We check if we can import omegaconf, and if the hparams are omegaconf instances. if so, we use OmegaConf.merge to preserve the typing, such that saving hparams to yaml actually triggers the OmegaConf branch * avalaible * chlog * test Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>	2020-08-07 09:13:21 -04:00
Rohit Gupta	a642349228	Support limit_mode_batches (int) for infinite dataloader (#2840 ) * Support limit_mode_batches(int) for infinite dataloader * flake8 * revert and update * add and update tests * pep8 * chlog * Update CHANGELOG.md Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * Add suggestions by @awaelchli * docs * Apply suggestions from code review Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk> * Apply suggestions from code review * fix * max * check * add and update tests * max * check * check * check * chlog * tests * update exception message * Apply suggestions from code review Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk> Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>	2020-08-07 13:02:36 +02:00
Nima Sarang	793036d29c	Support returning python scalars in DP (#1935 ) * Override the default gather method to support scalars * add computing average of a list * bug: change if to elif * add some tests * change style * change documentation * use apply_to_collection in DP gather * use apply_to_collection in DP gather * fix warning msg * override gather method in DP * add tests for python scalars * add python scalars to docstring * Update message * override gather method in DP * formatting * chlog Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>	2020-08-07 09:18:29 +02:00
Nicki Skafte	9a402461da	Bugfix: Lr finder and hparams compatibility (#2821 ) * fix hparams lr finder bug * add tests for new functions * better tests * fix codefactor * fix styling * fix tests * fix codefactor * Apply suggestions from code review * modified hook Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: William Falcon <waf2107@columbia.edu> Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>	2020-08-07 00:34:48 +02:00
William Falcon	b507c42c47	clarify batch hooks (#2842 ) * modified hook * modified hook * modified hook * modified hook * modified hook * modified hook * modified hook * modified hook * modified hook * modified hook * modified hook * modified hook * modified hook	2020-08-05 20:01:30 -04:00
Ananya Harsh Jha	a5f2b89ed0	updated sync bn (#2838 ) * updated sync bn * updated sync bn * updated sync bn * updated sync bn * updated sync bn * updated sync bn * updated sync bn * updated sync bn * added ddp_spawn test * updated test * clean * clean Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>	2020-08-06 01:12:11 +02:00
William Falcon	5d0f0325d8	Revert "Support limit_mode_batches (int) for infinite dataloader" (#2839 ) * Revert "Support limit_mode_batches (int) for infinite dataloader (#2787)" This reverts commit `de9c9f0864`. * Update training_tricks.py	2020-08-05 15:57:26 -04:00
Ruotian(RT) Luo	bef27c58ed	save apex scaler states (#2828 )	2020-08-05 13:43:50 -04:00
Ruotian(RT) Luo	6034d5e37d	fix apex gradient clipping (#2829 )	2020-08-05 13:42:21 -04:00
Ananya Harsh Jha	e31c520c21	add support for sync_bn (#2801 ) * initial commit for sync_bn * updated changelog * tests * tests * ddp tests hanging with script tests * updated trainer * updated params * test * passingtests * passing tests * passing tests * passing tests * tests * removed apex * doc * doc * doc * doc * docs * tests * tests * tests	2020-08-05 13:29:05 -04:00

562 Commits
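
The message of commit 88bfed371e (#2957) above describes why `if dl:` can fail for a dataloader wrapping an iterable dataset, and why `if dl is not None:` is the safer check. The sketch below reproduces that behaviour in plain Python. `StreamDataset`, `LoaderLike`, `has_loader_truthy`, and `has_loader` are hypothetical stand-ins written for this illustration; they are not part of PyTorch or PyTorch Lightning, and the snippet only assumes the mechanism the commit message describes (no `__bool__`, so `bool()` falls back to `__len__`).

```python
class StreamDataset:
    """Hypothetical iterable-style dataset: iterable, but defines no __len__."""

    def __iter__(self):
        return iter(range(3))


class LoaderLike:
    """Hypothetical stand-in for a dataloader: has __len__ but no __bool__."""

    def __init__(self, dataset):
        self.dataset = dataset

    def __len__(self):
        # Delegates to the dataset, so this raises TypeError
        # when the dataset does not define __len__.
        return len(self.dataset)


def has_loader_truthy(dl):
    # Truthiness check: with no __bool__, Python calls __len__,
    # which fails for a length-less dataset.
    return True if dl else False


def has_loader(dl):
    # Identity check: never touches __len__, so it is safe for any loader.
    return dl is not None


loader = LoaderLike(StreamDataset())

try:
    has_loader_truthy(loader)
    truthy_check_failed = False
except TypeError:
    truthy_check_failed = True

print("truthiness check raised TypeError:", truthy_check_failed)
print("identity check result:", has_loader(loader))
```

The identity check also differs from the truthiness check for a loader of length zero: `has_loader` still reports that a loader was passed, which matches the intent of "was this argument supplied at all".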