lightning

Commit Graph

Author	SHA1	Message	Date
Nicki Skafte	68fd3086f1	Prevent flickering progress bar (#6009 ) * add padding * fix * fix * Update pytorch_lightning/callbacks/progress.py Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> * updated based on suggestion * changelog * add test * fix pep8 * resolve test * fix code format Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: tchaton <thomas@grid.ai>	2021-02-17 19:01:51 +00:00
chaton	e982800b81	Add PredictLoop (#5752 ) * integrate distrib_type * sync changes * sync * fixes * add forgotten generators * add missing logic * update * import * missed imports * import fixes * isort * mv f * changelog * format * move helper to parallel plugin * d * add world size * clean up * duplicate * activate ddp_sharded and tpu * set nvidia flags * remove unused colab var * use_tpu <-> on_tpu attrs * make some ddp_cpu and clusterplugin tests pass * Ref/accelerator connector (#5742) * final cleanup Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * connector cleanup Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * trainer cleanup Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * accelerator cleanup + missing logic in accelerator connector Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * add missing changes to callbacks Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * reflect accelerator changes to lightning module Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * clean cluster envs Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * cleanup plugins Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * add broadcasting Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * yapf * remove plugin connector Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * plugins * add predict_loop * manual optimization * clean predictloop * update optimizer routing * add predict loop on new accelerator * resolve a bug * add rank to torchelastic * add predict_loop * add predict loop on new accelerator * resolve a bug * fix memory mixed precision * update * setstate on trainer for pickling in ddp spawn * add predict_loop * clean predictloop * add predict loop on new accelerator * resolve a bug * add predict_loop * add predict loop on new accelerator * resolve a bug * add predict_loop * add predict loop on new accelerator * resolve a bug * add predict_loop * add predict loop on new accelerator * resolve a bug * add predict_loop * clean predictloop * add predict loop on new accelerator * resolve a bug * add predict_loop * add predict loop on new accelerator * resolve a bug * resolve tests * add predict method * add back commented accelerator code * adapt test for sync_batch_norm to new plugin * fix deprecated tests * fix ddp cpu choice when no num_processes are given * yapf format * skip a memory test that cannot pass anymore * remove sanetize * rename train to run_train * remove useless hooks * add misconfigurationException * remove wrong naming * resolve some legacy * udpate docstring * fix pickle error in spawn plugin * x * avoid * x * fix cyclic import in docs build * add support for sharded * update typing * add sharded and sharded_spawn to distributed types * make unwrap model default * refactor LightningShardedDataParallel similar to LightningDistributedDataParallel * update sharded spawn to reflect changes * update sharded to reflect changes * Merge 1.1.5 changes * fix merge * fix merge * yapf isort * fix merge * yapf isort * fix indentation in test * copy over reinit scheduler implementation from dev1.2 * fix apex tracking calls with dev_debugger * reduce diff to dev1.2, clean up * fix trainer config test when gpus>0 and num_processes >0 and ddp_cpu * sort plugin tests legacy/new * fix error handling for amp on cpu * fix merge fix merge fix merge * [Feat] Resolve manual_backward (#5837) * resolve manual_backward * resolve flake8 * update * resolve for ddp_spawn * resolve flake8 * resolve flake8 * resolve flake8 Co-authored-by: Ubuntu <ubuntu@ip-172-31-88-60.ec2.internal> * fix tests/accelerator tests on cpu * [BugFix] Resolve manual optimization (#5852) * resolve manual_optimization * update * update Co-authored-by: Ubuntu <ubuntu@ip-172-31-88-60.ec2.internal> * Remove copy trainer parameters to happen earlier within the loop and add safe guard to get ref model (#5856) * resovle a bug * Accelerator refactor sharded rpc (#5854) * rpc branch * merge * update handling of rpc * make devices etc. Optional in RPC * set devices etc. later if necessary * remove devices from sequential * make devices optional in rpc * fix import * uncomment everything * fix cluster selection Co-authored-by: Ubuntu <ubuntu@ip-172-31-88-60.ec2.internal> * resolve bug * fix assert in rpc test * resolve a test * fix docs compilation * accelerator refactor - fix for sharded parity test (#5866) * fix memory issue with ddp_spawn * x x x x x x x x x * x * Remove DDP2 as this does not apply * Add missing pre optimizer hook to ensure lambda closure is called * fix apex docstring * [accelerator][BugFix] Resolve some test for 1 gpu (#5863) * update * revert init * resolve a bug * update * resolve flake8 * update * update * update * revert init * resolve a bug * update * resolve flake8 * update * update * update * update * update * revert init * resolve a bug * update * resolve flake8 * update * update * update * revert init * update * resolve flake8 * update * update * update * update * update * all_gather * update * make plugins work, add misconfig for RPC * update * update * remove breaking test * resolve some tests * resolve flake8 * revert to ddp_spawn Co-authored-by: root <root@ip-172-31-88-60.ec2.internal> Co-authored-by: Ubuntu <ubuntu@ip-172-31-88-60.ec2.internal> Co-authored-by: Justus Schock <justus.schock@rwth-aachen.de> * yapf isort * resolve flake8 * fix apex doctests * fix apex doctests 2 * resolve docs * update drone * clean env * update * update * update * update * merge * Fix RPC related tests, clean out old API, update for new accelerator API [skip ci] (#5881) * Fix RPC related tests, clean out old API, update for new accelerator API * Move tests out of legacy folder, update paths and names * Update test_remove_1-4.py * Expose properties for tpu cores/gpus/num_gpus * Add root GPU property * Move properties to properties.py * move tests that were previously in drone * Fix root GPU property (#5908) * Move root GPU to property, remove horovod set as this is handled in horovod plugin, ensure we mock correctly to set GPU accelerator * Add missing tests back * fix best model path transfer when no checkpoint callback available * Fix setup hook order [wip] (#5858) * Call trainer setup hook before accelerator setup * Add test case * add new test * typo * fix callback order in test Co-authored-by: tchaton <thomas@grid.ai> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * rename ddp sequential -> rpc sequential for special test * revert * fix stupid merge problem * Use property in connector for sampler (#5913) * merge the import conflicts * fix spawning of processes in slurm * [wip] Fix some bugs for TPU [skip ci] (#5878) * fixed for single tpu * fixed spawn * fixed spawn * update * update * wip * resolve bugs * resolve bug * update on comment * removed decorator * resolve comments * set to 4 * update * update * need cleaning * update * update * update * resolve flake8 * resolve bugs * exclude broadcast * resolve bugs * change test * update * update * skip if meet fails * properly raise trace * update * add catch * wrap test * resolve typo * update * typo Co-authored-by: Lezwon Castelino <lezwon@gmail.com> Co-authored-by: Your Name <you@example.com> * resolve some tests * update * fix imports * update * resolve flake8 * update azure pipeline * skip a sharded test on cpu that requires a gpu * resolve tpus * resolve bug * resolve flake8 * update * updat utils * revert permission change on files * suggestions from carlos Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> * remove unrelated formatting changes * remove incomplete comment * Update pytorch_lightning/accelerators/__init__.py Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> * remove unrelated formatting change * add types * warn 1.7 ddp manual backward only if ddp kwarg unset * yapf + isort * pep8 unused imports * fix cyclic import in docs * Apply suggestions from code review * typer in accelerator.py * typo * resolve flake8 * update code * update * Update pytorch_lightning/trainer/predict_loop.py Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> * Update pytorch_lightning/trainer/predict_loop.py Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> * fix merge * fix merge * reset legacy accelerator * add missing rename dispatch * rename post traning * update code * resolved comments * typo * typo * add flow description * resolve comments * update on comments * update flow * add backticks * resolve tpu Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com> Co-authored-by: justusschock <justus.schock@posteo.de> Co-authored-by: Justus Schock <justus.schock@rwth-aachen.de> Co-authored-by: Ubuntu <ubuntu@ip-172-31-88-60.ec2.internal> Co-authored-by: Sean Naren <sean.narenthiran@gmail.com> Co-authored-by: SeanNaren <sean@grid.ai> Co-authored-by: root <root@ip-172-31-88-60.ec2.internal> Co-authored-by: Lezwon Castelino <lezwon@gmail.com> Co-authored-by: Your Name <you@example.com> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>	2021-02-16 17:11:56 -05:00
Jirka Borovec	79d42d83e7	formatting 3/n: PL modules (#5716 ) * cb * log * prof * tune * flake8	2021-02-08 14:28:38 -05:00
Adrian Wälchli	7b42494d0e	Fix visual progress bar bug / properly reset progress bar (#4579 ) * reset * fix reset * changelog * update chlog * typing Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2021-02-03 19:41:44 +01:00
chaton	3da28fd634	[feat] 1/2 Add trainer.predict (#5579 ) * start adding predict * add predict * resolve test * add predict * remove limit_predict * update * add test for predict * typo * update on comments * remove predict_step * update ddp_shareded * check ddp_sharded * resolve on comments * resolve isort * update dp * add test dp 1 gpu * made default forward * resolve path * resolve bug * update on comments * resolve doc * resolve bug * update * resolve bug * update on comments * resolve pep8 * update test doc * update on comments * solve special tests * resolve bug * resolve flake8 * Update pytorch_lightning/callbacks/progress.py Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> * Update pytorch_lightning/trainer/trainer.py Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> * add predict to LightningModule * missing predict * typo * rename is_prediction to _predicting * add * update * update * update doc Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>	2021-01-27 11:38:14 -05:00
Jirka Borovec	dee5553b2b	move to Pages dir (#4869 ) * folders * common / advanced / extensions * paths * flake8 * isort Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>	2021-01-26 15:07:07 -05:00
Rohit Gupta	9cfbf8d609	Disable checkpointing, earlystopping and logging with fast_dev_run (#5277 ) * Disable checkpointing, earlystopping and logger with fast_dev_run * docs * chlog * disable callbacks and enable DummyLogger * add log * use dummy logger method * Apply suggestions from code review Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: chaton <thomas@grid.ai> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> (cherry picked from commit `f740245521`)	2021-01-06 12:57:24 +01:00
Adrian Wälchli	89e8796e2a	fix incomplete progress bar when refresh_rate > num batches (#4577 ) * fix progress bar overshoot * fix updates for partially incomplete main progress bar when val loop starts * add tests * chlog	2020-11-24 00:01:33 +01:00
William Falcon	4c0d063c86	outputs in __batch_end hooks (#3966 ) * train_batch_end outputs * added tests for the output hooks	2020-10-07 21:48:38 -04:00
William Falcon	65b6a6a497	0.10.0 (#3965 )	2020-10-07 20:41:56 -04:00
Rohit Gupta	a628d181ee	Fix val_progress_bar total with num_sanity_val_steps (#3751 ) * Fix val_progress_bar total with num_sanity_val_steps * chlog * Fix val_progress_bar total with num_sanity_val_steps * move test * replaced with sanity flag and suggestions	2020-10-04 08:32:18 -04:00
Jeff Yang	951048a81e	docs: use ref for anchor links, fix a few typo (#3486 )	2020-09-13 21:04:21 -04:00
William Falcon	bda1400225	ref: restore on_eval_start hook (#3183 ) * restore eval loop hook	2020-08-26 00:45:43 -04:00
William Falcon	2f6d82e0e6	ref: remove on_eval_start hook (#3176 ) * remove on_eval_start hook * remove on_eval_start hook	2020-08-25 22:28:00 -04:00
Rohit Gupta	7cca3859a7	Fix num_sanity_val_steps is clipped to limit_val_batches (#2917 ) * Fix num_sanity_val_steps according to limit_val_steps * fix test * add num_sanity_batches * pep * update docstring in test * add more test * chlog * update comments and docstring in test Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Adrian Wälchli <adrian.waelchli@inf.unibe.ch> Co-authored-by: Ananya Harsh Jha <ananya@pytorchlightning.ai>	2020-08-21 20:11:31 +02:00
Ananya Harsh Jha	10150fccb0	make progress bar match internal epoch counter (#3061 ) * fix 3018, 3032 * changed progress bar for 3032	2020-08-20 06:27:48 -04:00
William Falcon	f43028f3ae	added copyright notices (#3062 )	2020-08-19 22:03:22 -04:00
William Falcon	f82d7feb6c	updated hooks (#2850 ) * modified hooks * modified hooks * modified hooks * modified hooks * modified hooks * modified hooks * modified hooks * modified hooks * modified hooks	2020-08-07 09:29:57 -04:00
William Falcon	b507c42c47	clarify batch hooks (#2842 ) * modified hook * modified hook * modified hook * modified hook * modified hook * modified hook * modified hook * modified hook * modified hook * modified hook * modified hook * modified hook * modified hook	2020-08-05 20:01:30 -04:00
Rohit Gupta	84c507c4df	Fix max_batches with fast_dev_run. (#2581 ) * Fix fast_dev_run to run for all val_dataloaders * fast_dev_run check * changelog * explicit * limit_batches with fast_dev_run in init * add test * whitespace and comment fix * comment and assertion * added tests * Fix fast_dev_run to run for all val_dataloaders * fast_dev_run check * changelog * explicit * limit_batches with fast_dev_run in init * add test * whitespace and comment fix * comment and assertion * added tests * added tests * added tests * added tests * update rtol * Revert "update rtol" This reverts commit `4320329540`. * added tests Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-07-27 17:56:55 -04:00
Adrian Wälchli	1e68968ed7	support num_sanity_val_steps=-1 (#2246 ) * support sanity_val_step=-1 * fix list size * simplification * simplify * add test for num_sanity_val_steps=-1 * update test * update docs * extend tests to multiple dataloaders * changelog * Update tests/trainer/test_trainer.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * improve test * refactor the sanity check decision * fix merge * Update trainer.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-07-23 07:07:03 -04:00
Oliver Neumann	1a54ed6ad9	Checking ipywidgets is installed for ensure tqdm working (#2417 ) * Adding importing ipywidgets before importing tqdm.auto to make sure ipywidgets is installed. * Updated CHANGELOG.md * Updated ipywidgets importing checks to @awaelchli comments. Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-06-30 16:59:35 -04:00
Jirka Borovec	f278ac42c8	Revert/Fix: epoch indexing from 1, to be from 0 (#2289 ) * Revert "deprecated: epoch indexing from 1 (#2206)" This reverts commit `f94b919b` * chlog * grad index * Apply suggestions from code review * tests * fix * test	2020-06-19 23:39:53 -04:00
Paweł Biernat	3256fe4e5a	Update progress.py (#2268 ) Fixes a minor bug introduced in #2213	2020-06-19 15:47:39 -04:00
William Falcon	04c794ca72	[WIP] Rename overfit_pct to overfit_batches (and fix) and val_percent_check and test_percent_check (and fix) (#2213 ) * fixed percent check for val/test * fixed percent check for val/test * fixed percent check for val/test * fixed percent check for val/test * overfit_pct now uses train loaders for val and test and does not shuffle * overfit_pct now uses train loaders for val and test and does not shuffle * overfit_pct now uses train loaders for val and test and does not shuffle * overfit_pct now uses train loaders for val and test and does not shuffle * overfit_pct now uses train loaders for val and test and does not shuffle * overfit_pct now uses train loaders for val and test and does not shuffle * overfit_pct now uses train loaders for val and test and does not shuffle * overfit_pct now uses train loaders for val and test and does not shuffle * overfit_pct now uses train loaders for val and test and does not shuffle * overfit_pct now uses train loaders for val and test and does not shuffle * overfit_pct now uses train loaders for val and test and does not shuffle * overfit_pct now uses train loaders for val and test and does not shuffle * overfit_pct now uses train loaders for val and test and does not shuffle * overfit_pct now uses train loaders for val and test and does not shuffle * overfit_pct now uses train loaders for val and test and does not shuffle * overfit_pct now uses train loaders for val and test and does not shuffle * overfit_pct now uses train loaders for val and test and does not shuffle * overfit_pct now uses train loaders for val and test and does not shuffle * overfit_pct now uses train loaders for val and test and does not shuffle * overfit_pct now uses train loaders for val and test and does not shuffle * overfit_pct now uses train loaders for val and test and does not shuffle * overfit_pct now uses train loaders for val and test and does not shuffle * overfit_pct now uses train loaders for val and test and does not shuffle * add on fit_start on fit_end hooks * add on fit_start on fit_end hooks * add on fit_start on fit_end hooks Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-06-17 08:03:28 -04:00
Jirka Borovec	f94b919b96	deprecated: epoch indexing from 1 (#2206 ) * epoch indexing from 1 * chlog * fix tests * fix tests * self.min_epochs	2020-06-16 06:33:41 -04:00
Ashraful Islam	e0a5aee3a3	fix porgressbar postfix order (#1874 )	2020-05-18 20:33:51 -04:00
William Falcon	eeb411144f	enable fast_dev_run without a validation loop (#1779 ) * fix val dataloader * Update evaluation_loop.py	2020-05-11 11:30:22 -04:00
William Falcon	88c086bbd2	Update progress.py	2020-05-11 09:48:15 -04:00
William Falcon	15c11fc848	fixes no val loader	2020-05-11 09:47:33 -04:00
Adrian Wälchli	d06d5e68b6	Fix typo in progress bar docs (#1680 ) * fix typo * Typo * typo Borda Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-05-02 09:08:21 -04:00
Adrian Wälchli	3e8f2d99a9	Progress bar callback (#1450 ) * squash and rebase sanity check hooks sanity check callback hook finish moved core progress bar functionality into callback wip remove duplicate merge clean up imports docs sanity check progress bar main sanity move callback calls init progrss bar callback configuration and docs changelog rate decorator pass process_position disable on rank > 0 position index is_enabled remove decorator refactor init tqdm bars callback method ordering cannot reset when disabled sequence -> list default values fix has no attr _time() move on_val_end to proper place fix the pickle issue update warning properties check for None remove old comment switch order pull out non-tqdm functionality into base class documentation for the base class docs fix refresh rate issue in validation restrict type hint of trainer arg more docs update trainer docs rst docs fix lines too long fix test add missing type hints fix typo move docstring to __init__ solves doctest failures remove doctest :(( can't fix the pickle error fix example simplify by saving trainer reference fix docs errors move docstring initial value multiple val checks per epoch simpler handling of inf dataset sizes update inf docs renamed training_tqdm_dict rename get_tqdm_dict rename occurences of tqdm update changelog fix doctest fix formatting errors added callback tests progress bar on off test more tests for progress bar weird test fix? add ignored property disable default progress bar in LR finder change enable/disable behavior trying doctest in CI again undo doctest pickle error undo doctest pickle error :(( remove progress_bar_callback Trainer arg and fix tests restore progress bar after auto lr find update docs fix rebase fix wrong negation * fix fast dev run total * more thorough testing * remove old args * fix merge * fix merge * separate tests * type hint total batches * reduce if Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * is_disabled Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * is_enabled Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * rename enabled/disabled * move deprecated api * remove duplicated test from merge * fix rename is_disabled * newline * test also testprogress for fast dev run Co-authored-by: J. Borovec <jirka.borovec@seznam.cz> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-04-23 20:46:18 -04:00

32 Commits