lightning

Commit Graph

Author	SHA1	Message	Date
Ivan Švogor	25b771ca08	Create the loss accumulator directly on the device (#12430 ) Co-authored-by: Ivan Svogor <ivan.svogor@iarai.ac.at> Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2022-03-25 12:46:17 +01:00
edward-io	87bd54aedf	fix typos (#11937 )	2022-02-16 17:27:51 -08:00
Carlos Mocholí	963adc7857	Small cleanup when dataloader states are saved (#11843 )	2022-02-16 20:57:21 +00:00
Rohit Gupta	59ef66c06b	Fix support for `CombinedLoader` while checking for warning raised with eval dataloaders (#10994 )	2021-12-14 20:43:23 +05:30
Carlos Mocholí	0061619e0a	Improve typing for loops (#10780 )	2021-11-30 20:28:55 +00:00
thomas chaton	3d6262b7a9	Fault Tolerant Manual: Add support for DDP (#10638 )	2021-11-25 18:31:53 +01:00
thomas chaton	b28ab34ff5	Fault Tolerant Manual: Add loading to reload the states (#10699 ) Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com> Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2021-11-23 17:18:36 +00:00
thomas chaton	8d810d6144	Enable distributed training with CombinedDataLoader and max_size_cycle (#10374 ) * solve combinedloader * update * update changelog * update on comments * resolve iterable dataset support * update test description * update * update on comments * update * Accelerator auto * Address review * Refactor Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>	2021-11-09 20:06:10 +00:00
Gili Tzabari	a967b6eba0	del iterator on_run_end() (#9915 )	2021-10-29 16:29:44 +00:00
thomas chaton	5841ca9782	[Feat] Add auto_restart for fault tolerant training (#9722 )	2021-10-01 16:37:17 +00:00
thomas chaton	9148a13de0	Enable DataLoader state restoration for the evaluation loop (#9563 ) Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>	2021-09-24 16:21:00 +00:00
Jirka Borovec	6e124e7207	CI: precommit - docformatter (#8584 ) * CI: precommit - docformatter * fix deprecated Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2021-09-06 12:49:09 +00:00
Adrian Wälchli	a5e2f2b432	fix state extraction from batch when fault-tolerant training (#9281 )	2021-09-02 11:57:40 -07:00
Adrian Wälchli	b13749b4ec	add fault-tolerance for global random state in map-style datasets (#8950 ) Co-authored-by: tchaton <thomas@grid.ai> Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com> Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2021-08-26 12:13:31 +00:00
Adrian Wälchli	38ceb8943e	add docs (#8952 )	2021-08-18 10:33:42 +00:00
Adrian Wälchli	522df2b89b	3/n integrate new LightningDataFetcher into loop (#8953 ) Co-authored-by: tchaton <thomas@grid.ai> Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>	2021-08-17 21:42:22 +00:00
ananthsub	037a86c873	Remove write_predictions from LightningModule (#8850 ) * Remove write_predictions from LightningModule	2021-08-14 02:00:23 +00:00
thomas chaton	e060547230	[Bug] Add SharedCycleIteratorState (#8889 )	2021-08-13 19:06:56 +01:00
Carlos Mocholí	e63968ab88	Add `pyupgrade` to `pre-commit` (#8557 ) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2021-07-26 14:38:12 +02:00
Carlos Mocholí	a64cc37394	Replace `yapf` with `black` (#7783 ) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2021-07-26 13:37:35 +02:00
Carlos Mocholí	f7027a8701	Remove `torch >= 1.6` checks (#8523 )	2021-07-23 04:03:20 +00:00
thomas chaton	374fae59ef	[Feat] Add utilities for CombinedLoader state dict and dataloader state dict 1/n (#8364 ) Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Justus Schock <justus.schock@posteo.de> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>	2021-07-19 09:56:57 +00:00
deepsource-autofix[bot]	03154eb30a	Refactor unnecessary `else` / `elif` when `if` block has a `return` statement (#8156 ) Co-authored-by: deepsource-autofix[bot] <62050782+deepsource-autofix[bot]@users.noreply.github.com> Co-authored-by: Jirka <jirka.borovec@seznam.cz>	2021-06-28 15:27:41 +05:30
deepsource-autofix[bot]	e11fe19673	Remove unnecessary use of comprehension (#8149 ) Co-authored-by: deepsource-autofix[bot] <62050782+deepsource-autofix[bot]@users.noreply.github.com>	2021-06-27 10:00:02 +01:00
Justus Schock	7b283e3c46	Bugfix/Multiple dataloaders (#7433 ) * Update supporters.py * Update apply_func.py * Update supporters.py * Update model_train_dataloaders.py * Update model_train_steps.py * Update test_dataloaders.py * Update CHANGELOG.md * Update model_train_steps.py * Update test_dataloaders.py * Update test_dataloaders.py * Update supporters.py * Update test_supporters.py * Apply suggestions from code review Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update tests/trainer/test_dataloaders.py Co-authored-by: Akihiro Nitta <nitta@akihironitta.com> * Apply suggestions from code review Co-authored-by: Edgar Riba <edgar.riba@gmail.com> * Update supporters.py * Update supporters.py * Apply suggestions from code review Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Akihiro Nitta <nitta@akihironitta.com> Co-authored-by: Edgar Riba <edgar.riba@gmail.com> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>	2021-05-11 16:33:29 +02:00
Adrian Wälchli	b9b3fa371f	fix case where an IterableDataset doesn't produce a batch for an epoch (#7294 ) * wip * fix * add test * refactor + test * rm * formatting * update changelog * doc * docstring * remove unused import * Update CHANGELOG.md Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com> Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>	2021-04-30 12:45:55 +00:00
thomas chaton	9beec26c3e	[bugfix] Add support for CombinedLoader in validation with ddp (#7102 ) * add test * add changelog * resolve flake8 * remove print	2021-04-20 08:22:02 +00:00
Carlos Mocholí	f0c5479de9	Remove legacy `Result` parameters (#6016 )	2021-03-28 11:55:08 +02:00
Jirka Borovec	aba212341a	formatting 4/n: Trainer (#5720 ) * yapf trainer * Apply suggestions from code review Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * . * fix Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2021-02-03 09:25:42 +00:00
Jirka Borovec	c3587d39da	prune deprecated EvalResult (#5633 ) * prune EvalResult * drop tests * drop usage * drop class * prune	2021-01-26 03:09:39 -05:00
Jirka Borovec	2846322f60	fix docs render (#5610 )	2021-01-25 20:21:00 -05:00
Justus Schock	ef7345dc4e	add possibility for nested loaders (#5404 ) * add possibility for nested loaders * pep8: newline	2021-01-24 07:32:02 -05:00
Arnaud Gelas	a9d9f33a86	Fix isort failures in trainer (#5529 ) Remove from skipped module in pyproject.toml and fix failures on: - pytorch_lightning/trainer/*.py	2021-01-18 13:42:50 -05:00
Loi Ly	1d13943605	Fix reset TensorRunningAccum (#5106 ) * Fix reset TensorRunningAccum * add test for TensorRunningAccum's reset method * fix CI failed due to PEP8 Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2021-01-05 09:58:36 +01:00
Jirka Borovec	c72880f109	hotfix: dataloaders - add unimplemented methods (#5352 ) * add unimplemented methods * test * test * flake8	2021-01-05 03:41:20 -05:00
Justus Schock	d88cf4a652	Add Support for multiple train loaders (#1959 ) * add support for wrong dtype in apply_func * apply loader resetting to possible collection of loaders * add combined loader iter class * integrate combined loader iter to training loop * fix imports * fix imports * finish supporters * add tests for supporters * add test for model with multiple loaders * fix trainer integration * fix instance check * Train loaders (#4032) * patch for issues discussed in #1959, encapsulating underlying datastructures returned from train_dataloader * update data_loading.py to it uses patch discussed in #1959 * rename class * Separate CombinedLoaderIterator into two classes, and update related tests. (#4606) * Fix the bugs after rebasing. * Add custom get_len for apply_to_collection * Refactor MultiIterator to be as CombinedLoaderIterator * To get the right num_training_batches. Call the wrapper for multi trainloader in data_loading.py, instead of training_loop.py * Reload _loader_iters when calling __iter__ * Don't transform DataLoader to CombinedLoaderIterator when it's along * Updates test_fit_multiple_train_loaders for testing num_training_batches * Seperate CombinedLoaderIterator into CombinedLoaderIterator and CombinedDataLoader. Add CombinedDataset for unified DataLoader format. * Initialize CombinedDataLoader before calculating num_training_batches. Also updating self._worker_check for multiple loaders * Update tests for supporters * Update tests for multiple trainloaders. Add tests about few_workers for multiple loaders. * Fix pep8 issues * Add tests for train_loader_patch.py * Add descriptions to multiple_trainloader_mode * Remove unused variables * Add docstrings and typing * Add more tests for better converage * Remove unused commented codes * Add sampler property * Remove extract_dataset * Update typing * pep8 * Update train_loader_patch.py * Apply suggestions from code review Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update pytorch_lightning/trainer/supporters.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * reviewer comments * fix stupid import * add docs * add back line separator * fix line sep * pep8 * Apply suggestions from code review * fix * fix * Apply suggestions from code review Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * Apply suggestions from code review Co-authored-by: Nicki Skafte <skaftenicki@gmail.com> * flake8 Co-authored-by: Justus Schock <justusschock@justuss-mbp.fritz.box> Co-authored-by: Christofer Fransson <christofer_fransson@yahoo.com> Co-authored-by: YI-LIN SUNG <r06942076@ntu.edu.tw> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Nicki Skafte <skaftenicki@gmail.com> Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>	2021-01-04 19:57:53 +00:00
Jirka Borovec	0f36525e8f	fix/enable - check F401 (#5201 ) * refactor - check F401 * missed * fix	2020-12-21 10:15:04 +01:00
Justus Schock	ebbf256bf5	Create memory dynamically (#4938 ) * create window size dynamically. * pep8 Co-authored-by: chaton <thomas@grid.ai>	2020-12-02 01:05:12 +05:30
chaton	958aa1aee7	[test] Accumulated gradient optimization tests (#4477 ) * adding tests * wip * update * Update tests/trainer/test_trainer.py Co-authored-by: Sean Naren <sean.narenthiran@gmail.com> Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>	2020-11-02 23:44:11 +00:00
ananthsub	d3f40d6a9e	Update to_disk to use fsspec for remote file support (#3930 ) * Update supporters.py * Update CHANGELOG.md * Update supporters.py * Update supporters.py * Update supporters.py * Update supporters.py * Update supporters.py * Update supporters.py * Update CHANGELOG.md Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-10-07 07:28:23 -04:00
William Falcon	f43028f3ae	added copyright notices (#3062 )	2020-08-19 22:03:22 -04:00
Nathan Raw	b9695237f1	Save test predictions on multiple GPUs (#2926 ) * Save test predictions on multiple GPUs	2020-08-14 17:52:43 -04:00
William Falcon	6d10ac2ac8	Structured results (train loop only. val loop separate PR) (PR 2/5) (#2615 ) * r * patched optimizer closure with sr * added train step structured result * added autoreduce for train step * added auto reduce on train * added hooks * finished tests for structured results on train epoch * cache * finished tests for structured results on train epoch * Update pytorch_lightning/callbacks/early_stopping.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update pytorch_lightning/callbacks/model_checkpoint.py * Update pytorch_lightning/core/step_result.py * finished tests for structured results on train epoch * Apply suggestions from code review Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com> * simple * finished tests for structured results on train epoch * simple * revert * finished tests for structured results on train epoch * Update tests/base/deterministic_model.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * finished tests for structured results on train epoch * docstring typos * finished tests for structured results on train epoch * Update pytorch_lightning/core/step_result.py Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * Update pytorch_lightning/overrides/data_parallel.py Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Jirka <jirka@pytorchlightning.ai> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>	2020-07-20 19:00:20 -04:00
Udit Arora	08573d0f7e	Fix some pyright member access errors in training module (#2121 ) * Fix pyright member access errors in training module * Fix Trainer instantiation error due to inheritence order * Add GH workflow for pyright * Fix more pyright errors in trainer module * Add pyrightconfig and setup python environment in type-check workflow * Exclude pyrightconfig.json * suggestions Co-authored-by: Jirka <jirka@pytorchlightning.ai>	2020-06-12 17:23:18 +02:00
Alexey Karnachev	4c34d16a34	Fixed configure optimizer from dict without "scheduler" key (#1443 ) * `configure_optimizer` from dict with only "optimizer" key. bug fixed * autopep8 * pep8speaks suggested fixes * CHANGELOG.md upd	2020-04-10 11:43:06 -04:00
Alexey Karnachev	ddbf7de6dc	Added accumulation of loggers' metrics for the same steps (#1278 ) * `add_argparse_args` method fixed (argument types added) * autopep8 fixes * --gpus=0 removed from test (for ci tests) * Update pytorch_lightning/trainer/trainer.py Co-Authored-By: Joe Davison <joe@huggingface.co> * test_with_accumulate_grad_batches added * agg_and_log_metrics logic added to the base logger class * small format fix * agg metrics strategies removed (not to complicate stuff) * agg metrics: handle zero step * autopep8 * changelog upd * flake fix * metrics aggregators factored out, metrics_agg.py added + tests * metrics agg default value added * Update pytorch_lightning/loggers/metrics_agg.py Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * metrics aggregators factored out, metrics_agg.py added + tests * metrics agg default value added * Update pytorch_lightning/loggers/metrics_agg.py Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * remove .item which causes sync issues (#1254) * remove .item which causes sync issues * fixed gradient acc sched * fixed gradient acc sched * test_metrics_agg.py removed (all tested in doctrings), agg metrics refactored * test_metrics_agg.py removed (all tested in doctrings), agg metrics refactored * autopep8 * loggers base.py types fixed * test * test * metrics aggregation for loggers: each key now has a specific function (or default one) * metrics aggregation for loggers: each key now has a specific function (or default one) * docstrings upd * manual typehints removed from docstrings * batch_size decreased for test `test_with_accumulate_grad_batches` * extend running accum * refactor * fix tests * fix tests * allowed_types generator scoped * trainer.py distutils was imported twice, fixed * TensorRunningAccum refactored * TensorRunningAccum added to change log (Changed) * change log pull link added Co-authored-by: Joe Davison <joe@huggingface.co> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: William Falcon <waf2107@columbia.edu> Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>	2020-04-08 08:35:47 -04:00
Paweł Rzepiński	b8ff9bc1d2	Fix unimplemented type() on TPU (#1396 ) * Fix unimplemented type() on TPU * Add changelog entry * Add quotation marks	2020-04-06 20:29:55 -04:00
Jirka Borovec	31017120fd	fix incomplete RunningMean (#1309 ) * fix RunningMean * changelog * fix none * Update supporters.py just needed to multiply by zero for init * Revert "Update supporters.py" This reverts commit `7e0da6c6` * fix NaN * formatting Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-03-30 18:28:31 -04:00

48 Commits