lightning

Commit Graph

Author	SHA1	Message	Date
Mauricio Villegas	b7f3a3c421	Simple reproducibility with minimum boilerplate CLI training with `LightningCLI` (#4492 ) Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>	2021-04-06 14:19:11 +01:00
thomas chaton	1302766f83	DeepSpeed ZeRO Update (#6546 ) * Add context to call hook to handle all modules defined within the hook * Expose some additional parameters * Added docs, exposed parameters * Make sure we only configure if necessary * Setup activation checkpointing regardless, saves the user having to do it manually * Add some tests that fail currently * update * update * update * add tests * change docstring * resolve accumulate_grad_batches * resolve flake8 * Update DeepSpeed to use latest version, add some comments * add metrics * update * Small formatting fixes, clean up some code * Few cleanups * No need for default state * Fix tests, add some boilerplate that should move eventually * Add hook removal * Add a context manager to handle hook * Small naming cleanup * wip * move save_checkpoint responsability to accelerator * resolve flake8 * add BC * Change recommended scale to 16 * resolve flake8 * update test * update install * update * update test * update * update * update test * resolve flake8 * update * update * update on comments * Push * pull * Update pytorch_lightning/plugins/training_type/deepspeed.py Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * Update pytorch_lightning/plugins/training_type/deepspeed.py Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * update * Apply suggestions from code review * Swap to using world size defined by plugin * update * update todo * Remove deepspeed from extra, keep it in the base cuda docker install * Push * pull * update * update * update * update * Minor changes * duplicate * format * format2 Co-authored-by: SeanNaren <sean@grid.ai> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Sean Naren <sean.narenthiran@gmail.com> Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>	2021-03-30 13:39:02 -04:00
Jirka Borovec	8cd75a4dd5	fix comparing versions (#6434 ) * fix comparing versions * chlog * . * ... * datasets	2021-03-23 07:51:45 +00:00
Jirka Borovec	156847bea7	CI: resume testing with py3.8 (#6516 ) * testing on python 3.8 * req	2021-03-15 12:07:23 +01:00
Jirka Borovec	38274b9de9	unfreeze torchtext version (#6302 )	2021-03-02 10:38:02 -05:00
Jirka Borovec	960a60743f	fix fairscale compatible with PT 1.8 (#5996 ) * try to extend fairscale available * 1.2	2021-02-16 19:43:02 +00:00
Jirka Borovec	9dd56398e3	fixing some compatibility with PT 1.8 (#5864 ) * change default * . * p * 0.21.2 * . * fix * .	2021-02-09 18:25:57 +01:00
Jirka Borovec	7e4d6cbe48	set minimal req. PT 1.4 (#5418 ) * set minimal req. PT 1.4 * chlog	2021-01-12 19:15:35 -05:00
chaton	7755572b4f	Check if optimizer supports closure (#4981 ) * check if optimizer support closure * cleanup test * resolve tests * resolve flake * update test due to patch limit * update * update dep * Update tests/core/test_lightning_optimizer.py Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * Update tests/core/test_lightning_optimizer.py Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * resolve bug * update test * resolve tests * Update requirements/extra.txt Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * remove bolts dep * remove bolts * add missing bolts dep for tests * remove need for bolts Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-12-11 14:51:45 +01:00
chaton	ef8ef12fd0	[feat] pp 2/n (#5026 ) * Added changes for RPC plugin * Add missing kwargs * Fix code format * Loading refactors by introducing is_distributed var, fix optimizer step flow * Add rpc guard * Added docstrings and typing * resolve comments * Add additional rpc hook, refactor name of exit process hook for clarity * remove annotation * Modify behaviour to allow optional return, add test for rpc plugin * resolve tests * rename is_ddp_based * update * update for windows * update * resolve test * code smell * Added sequential plugin * resolve bug * update * cleanup * add Exception * resolve docs * Remove ddp support * Revert distributed -> ddp * Update pl_examples/basic_examples/conv_sequential_example.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update pl_examples/basic_examples/conv_sequential_example.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update pytorch_lightning/plugins/ddp_sequential_plugin.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Address code review points * Update pytorch_lightning/plugins/ddp_sequential_plugin.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update pytorch_lightning/plugins/ddp_sequential_plugin.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Add missing return * Fix formatting, add datamodule args * add small comment * resolve comments * resolve comments * update source for fairscale * update extras * remove staticmethod * resolve flake8 * Skip tests that are failing due to bug upstream with multiple optimizers and shard * update * update on comments * clean test * latest comments * remove old comments * add todo * Update version * update * resolve bugs * resolve bugs * update test * remove hanging test * Update pytorch_lightning/plugins/ddp_sequential_plugin.py Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> * resolve on comments * Update pytorch_lightning/plugins/ddp_sequential_plugin.py Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> * resolve on comments * Update pytorch_lightning/plugins/ddp_sequential_plugin.py Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> * Update pytorch_lightning/plugins/ddp_sequential_plugin.py Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> * Update pytorch_lightning/plugins/ddp_sequential_plugin.py Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> * Update pytorch_lightning/plugins/ddp_sequential_plugin.py Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> * remove ImportError Co-authored-by: SeanNaren <sean@grid.ai> Co-authored-by: Sean Naren <sean.narenthiran@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>	2020-12-09 12:56:51 +00:00
Jirka Borovec	eeae426b33	CI: skip hanging (#4943 ) * CI: try increase time limit * try min 3.8 * no ex * CI * dep * test * deps * deps * drop * drop Co-authored-by: chaton <thomas@grid.ai>	2020-12-02 16:18:14 +00:00
Jirka Borovec	b2611b7dfa	drop sklearn dependency (#4912 ) * drop sklearn dependency * scipy Co-authored-by: chaton <thomas@grid.ai> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2020-12-02 16:22:04 +01:00
Jeff Yang	563f9214fa	upgrade min deps (#4934 ) * upgrade min deps * unused * replace torchvision and torchtext * loggers * freeze pip Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz> Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>	2020-12-01 17:19:44 +00:00
SeanNaren	04bb0abe36	Merge branch 'master' into feature/plug # Conflicts: # pytorch_lightning/utilities/__init__.py # requirements/extra.txt	2020-11-27 10:00:05 +00:00
Jirka Borovec	217650320e	simplify imports Omegaconf (#4873 ) * hydra * omegaconf	2020-11-27 01:00:56 +01:00
SeanNaren	79527672cb	Remove amp check as guard now upstream	2020-11-26 10:13:27 +00:00
SeanNaren	a311ee17ab	Add fairscale requirement as zip before release	2020-11-25 18:16:36 +00:00
Travis Addair	51cc7a89ee	Horovod: fixed early stopping and added metrics aggregation (#3775 ) * Fixed early stopping for Horovod * Refactored to sync_dist_if_available * Bump min Horovod version to support hvd.is_initialized * Changelog * Added back change for Horovod * Removed redundant checks for initialization * Implement metrics gathering for Horovod * Added test for EvalResult * Renamed ddp_sync_on_step -> dist_sync_on_step * Added metric test for Horovod * Added option pass callable allgather function to metric base class * Added dist_sync_fn * Fixed calls to private _sync_dist * Fixed Horovod test * Added sync_tensor to the distributed backend * Skip Windows * Insert test path * Removed redundant import * Updated drone * Unset HOROVOD_GPU_ALLREDUCE * Unset * No cache dir * No uninstall * Unset variables * Uninstall Horovod during initialization * Replaced more references to ddp_sync_on_step * Fixed imports * Fixed attribute * Added back default * Lint * Added back docstring * Made gather_all_tensors default * Added whitespace * Update tests/models/test_horovod.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update pytorch_lightning/metrics/metric.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update CHANGELOG.md Co-authored-by: Teddy Koker <teddy.koker@gmail.com> Co-authored-by: Sean Naren <sean.narenthiran@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-11-05 12:52:02 -05:00
Jirka Borovec	9fb5f4340e	try update horovod (#4004 )	2020-10-08 18:44:35 -04:00
Adrian Wälchli	d65b037a40	Mocking Loggers Part 5/5 (final) (#3926 ) * base * add xfail * new test * import * missing import * xfail if not installed include mkpatch fix test * mock comet comet mocks fix test remove dep undo merge duplication * line * line * convert doctest * doctest * docs * prune Results usage in notebooks (#3911) * notebooks * notebooks * revamp entire metrics (#3868) * removed metric Co-authored-by: Teddy Koker <teddy.koker@gmail.com> * added new metrics Co-authored-by: Teddy Koker teddy.koker@gmail.com * pep8 Co-authored-by: Teddy Koker teddy.koker@gmail.com * pep8 Co-authored-by: Teddy Koker <teddy.koker@gmail.com> * docs Co-authored-by: Teddy Koker <teddy.koker@gmail.com> * docs Co-authored-by: Teddy Koker <teddy.koker@gmail.com> * win ddp tests skip Co-authored-by: Teddy Koker <teddy.koker@gmail.com> * win ddp tests skip Co-authored-by: Teddy Koker <teddy.koker@gmail.com> * win ddp tests skip Co-authored-by: Teddy Koker <teddy.koker@gmail.com> * win ddp tests skip Co-authored-by: Teddy Koker <teddy.koker@gmail.com> * reset in compute, cache compute Co-authored-by: Teddy Koker <teddy.koker@gmail.com> * reduce_ops handling Co-authored-by: Teddy Koker <teddy.koker@gmail.com> * sync -> sync_dist, type annotations Co-authored-by: Teddy Koker <teddy.koker@gmail.com> * wip docs Co-authored-by: Teddy Koker <teddy.koker@gmail.com> * mean squared error * docstring * added mean ___ error metrics * added mean ___ error metrics * seperated files * accuracy doctest * gpu fix * remove unnecessary mixin * metric and accuracy docstring Co-authored-by: Teddy Koker <teddy.koker@gmail.com> * metric docs Co-authored-by: Teddy Koker <teddy.koker@gmail.com> * pep8, changelog Co-authored-by: Teddy Koker <teddy.koker@gmail.com> * refactor dist utils, pep8 * refactor dist utils, pep8 Co-authored-by: Teddy Koker <teddy.koker@gmail.com> * Callback docs with autosummary (#3908) * callback docs with autosummary * do not show private methods * callback base docstring * skip some docker builds (temporally pass) (#3913) * skip some docker builds * todos * skip * use badges only with push (#3914) * testtube * mock test tube * mock mlflow * remove mlflow * clean up * test * test * test * test * test * test * code blocks * remove import * codeblock * logger * wandb causes stall Co-authored-by: William Falcon <waf2107@columbia.edu> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Ananya Harsh Jha <ananya@pytorchlightning.ai> Co-authored-by: Teddy Koker <teddy.koker@gmail.com> Co-authored-by: Jeff Yang <ydcjeff@outlook.com>	2020-10-06 23:49:06 -04:00
Adrian Wälchli	db0e295f67	Complete mocking Comet and remove dep (#3910 ) * xfail if not installed include mkpatch fix test * mock comet comet mocks fix test remove dep undo merge duplication * line * line * convert doctest * doctest * docs	2020-10-06 19:50:42 -04:00
Adrian Wälchli	e0f8505394	Mocking loggers (part 2, neptune) (#3617 ) * mock neptune base tests * neptune doctest * remove extra * mock loggers * typo * mock import * neptune not compatible with multigpu * add back experiment	2020-10-04 21:20:06 -04:00
Jirka Borovec	a94728c99b	spec Horovod version (#3661 ) * spec Horovod version * MAKEFLAGS="-j2" * tests * CI * docker * CI * docker	2020-09-26 19:30:25 +02:00
Adrian Wälchli	3ff5327e83	Mocking loggers (part 1, wandb) (#3596 ) * mocking for wandb * remove wandb import in amp test * mock loggers in sphinx * check tests * Update extra.txt * setup * dev * min * revert Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>	2020-09-25 16:00:02 +02:00
Jirka Borovec	7b64472ced	fix lib paths after Wandb 0.10 (#3520 ) * try * try * drop 0.20 * drop 0.19.5 * -U * Fixed Horovod in CI due to wandb==0.10.0 sys.path modifications (#3525) Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * format * wb freeze * types Co-authored-by: Travis Addair <taddair@uber.com>	2020-09-17 08:37:49 -04:00
Nathan Hunt	234e2b590f	Use .comet.config file for CometLogger (#1913 ) * Use .comet.config file or env var for API key. * Make CometLogger API key changes backwards compatible. * Fix line too long. * Add documentation about loading from ~/.comet_config. * Update required comet_ml version. * Comet logger: allow offline experiments with config file. This adds a new argument to the logger to control the online / offline mode explicitly so that if you give an API key and a save_dir (e.g. to control where checkpoints go while having ~/.comet.config) you can specify which mode you want. * Make CometLogger API key changes backwards compatible. * Comet logger: change online argument to be offline. For consistency with other loggers. * chlog Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>	2020-08-07 09:46:50 +02:00
Nicki Skafte	e3732789d7	Add remaning sklearn metrics (#2562 ) * added balanced accuracy * added dcg score * added mean absolute error * added mean squared error * fix * added mean squared log error * add median absolute error and r2 score * switch arguments * added mean poisson deviance * add mean gamma deviance and mean tweedie deviance * fix styling * added explained variance score * added cohen kappa score * added hamming, hinge, jaccard * fix styling * update sklearn requirement to newer version * update requirement * fix doctest * fix tests * added balanced accuracy * added dcg score * added mean absolute error * added mean squared error * fix * added mean squared log error * add median absolute error and r2 score * switch arguments * added mean poisson deviance * add mean gamma deviance and mean tweedie deviance * fix styling * added explained variance score * added cohen kappa score * added hamming, hinge, jaccard * fix styling * update sklearn requirement to newer version * fix doctest * fix tests * fix doctest * fix failing docs * fix test * trying to fix errors * Apply suggestions from code review * format Co-authored-by: Nicki Skafte <nugginea@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>	2020-08-05 11:32:53 +02:00
Lezwon Castelino	b7afac351b	Add onnx export (#2596 ) * export model to onnx * prepare data before exporting * support for dataloaders and tensors * added tests * use example_input_array add to changelog * updated docstring * added onnx inference tests * temp commit * removed schema valid test * add onnxruntime to environment.yml * moved onnxruntime to environment.yml pip * add example in doc * add lines between code block * added PR to changelog * is file check Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * remove * Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * infer example outputs * added doctest for onnx * fix windows tests * moved eval within condition block * self.forward to self * added docs * fixed docs error * added to toctree * Update CHANGELOG.md Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-07-31 12:27:57 +02:00
Jirka Borovec	bc833fbf52	Horovod & py3.8 (#2764 )	2020-07-30 23:39:07 +02:00
Jirka Borovec	40337cce58	freeze PT 1.5 for Horovod issue (#2744 ) * freeze pt 1.5 * torchtext * Apply suggestions from code review Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com> * timeout Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com>	2020-07-28 15:52:23 -04:00
Jeff Yang	0a65826462	metrics: add BLEU (#2535 ) * metrics: added bleu score and test bleu * metrics: fixed type hints in bleu * bleu score moved to metrics/functional/nlp.py * refactor with torch.Tensor * Update test_sequence.py * refactor as Borda requests and nltk==3.2 * locked nltk==3.3 * nltk>=3.3, parametrized smooth argument for test * fix bleu_score example * added class BLEUScore metrics and test * added class BLEUScore metrics and test * update CHANGELOG * refactor with torchtext * torchtext changed to optional import * fix E501 line too long * add else: in optional import * remove pragma: no-cover * constants changed to CAPITALS * remove class in tests * List -> Sequence, conda -> pip, cast with tensor * add torchtext in test.txt * remove torchtext from test.txt * bump torchtext to 0.5.0 * bump torchtext to 0.5.0 * Apply suggestions from code review * ignore bleu score in doctest, renamed to nlp.py * back to implementation with torch * remove --ignore in CI test, proper reference format * apply justus comment Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-07-22 09:58:24 -04:00
Jirka Borovec	75f0a2062c	move torchtext as optional (#2395 ) * torchtext * Update pytorch_lightning/utilities/apply_func.py Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * Update apply_func.py Co-authored-by: William Falcon <waf2107@columbia.edu> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2020-06-27 20:15:10 -04:00
Mateusz Pieniak	e82d9cdb66	Support torchtext on a single GPU (#2379 ) * Handle torchtext.data.Batch on GPU * Update CHANGELOG.md * Apply code review requests * Correct the docs * Change requirements	2020-06-27 16:36:45 -04:00
Jirka Borovec	41f5df18a4	move Trains logger to Bolts (#2384 ) * move Trains logger * chlog	2020-06-27 09:14:05 -04:00
Jirka Borovec	bfaabd7b7f	clean requirements (#2128 ) * clean requirements * missing * missing * req * min * default >> base * base.txt	2020-06-13 10:15:22 -04:00

35 Commits