lightning

Commit Graph

Author	SHA1	Message	Date
Aki Nitta	9da78a94bd	Rename `TPUSpawnPlugin` to `TPUSpawnStrategy` (#11190 ) Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2021-12-21 16:36:16 +00:00
Adrian Wälchli	f5c2881b68	3/n Simplify spawn plugins: Merge `pre_dispatch` and `setup` logic (#11137 )	2021-12-20 17:41:22 +01:00
ORippler	86a3c5e2a3	Add required states for resumed ModelCheckpoint GC (#10995 ) * Add required states for resumed ModelCheckpoint GC * Add backwards compatibility with legacy cktps Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com> * Add test to check if attrs are written to ckpt Note that we do not yet check for proper loading/reinstantiation of ModelCheckpooint based on the ckpt written to disk * Test if attributes are restored properly from ckpt * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix broken `test_callbacks_state_fit_ckpt_path` `ModelCheckpoint` is configured to save after every epoch, but `trainer.fit` is called with `max_steps = 1` Note there may be a better way of doing this, where `ModelCheckpoint` is called after `training_step` * Update test_restore.py * Update test_restore.py * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Check that all attributes are restored properly * revert changes, use fix on master * Convert to proper unit test * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Refactor `test_mode_checkpoint_saveload_ckpt` * First save, then load ckpt. * Instantiate ModelCheckpoint twice. Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2021-12-20 17:05:15 +01:00
Adrian Wälchli	29eb9cccf2	Rename the `TrainingTypePlugin` base to `Strategy` (#11120 ) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: four4fish <88516121+four4fish@users.noreply.github.com>	2021-12-20 12:50:11 +00:00
Rohit Gupta	61eb6230c2	Prune EvalModelTemplate (#11153 )	2021-12-19 13:08:43 +00:00
Rohit Gupta	860959fb3f	Enable logging hparams only if there are any (#11105 )	2021-12-17 19:40:56 +01:00
Carlos Mocholí	7e10f6d41f	Save the loop progress state by default (#10784 )	2021-12-17 16:00:27 +00:00
Carlos Mocholí	5932f52b2f	Avoid the deprecated `onnx.export(example_outputs=...)` in torch 1.10 (#11116 )	2021-12-17 10:11:11 +01:00
Adrian Wälchli	e19d93f69e	Initialize ModelCheckpoint state as early as possible (#11108 )	2021-12-17 00:18:29 +01:00
Adrian Wälchli	2b0075a47e	Teardown sync-batchnorm after training (#11078 ) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: ananthsub <ananth.subramaniam@gmail.com> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>	2021-12-16 18:58:44 +00:00
Rohit Gupta	61a744f5c6	Fix support for logging within callbacks returned from `LightningModule` (#10991 ) Co-authored-by: thomas chaton <thomas@grid.ai> Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>	2021-12-14 19:41:29 +01:00
Aka.Fido	72cc8b7ca9	Disable validation completely when `overfit_batches>0` (#9709 ) Co-authored-by: thomas chaton <thomas@grid.ai> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2021-12-01 13:57:57 +00:00
Abhinav Arora	f63222d966	Remove references to torchtext.legacy from PyTorch Lightning (#10724 ) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2021-11-30 19:32:07 +00:00
Carlos Mocholí	38ed26ec5a	Do not require omegaconf to run tests (#10832 )	2021-11-30 14:48:03 +00:00
Carlos Mocholí	1b43e43e9f	Minor changes in preparation for saving the loops state (#10783 )	2021-11-30 19:37:04 +05:30
four4fish	8bf7f9cce7	1/n Move Accelerator into strategy - move batch_to_device to strategy (#10649 ) * 1/n Integrate Device Specific Accelerator Logic with strategy - move batch_to_device to strategy * add changelog * add model is not none check * Apply suggestions from code review Co-authored-by: thomas chaton <thomas@grid.ai> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> * Update CHANGELOG.md * Update test_datamodules.py * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test_hooks.py * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update dp.py Co-authored-by: thomas chaton <thomas@grid.ai> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2021-11-29 12:11:21 -08:00
Carlos Mocholí	152eb57def	Rename special to standalone (#10779 )	2021-11-26 17:13:14 +00:00
Kaushik B	e0b4bb2ea3	Deprecate `DeviceType` in favor of `_AcceleratorType` (#10503 ) Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>	2021-11-25 16:41:03 +01:00
Rohit Gupta	823bfa6f8a	Update `LightningModule` docs (#10637 )	2021-11-23 01:02:04 +05:30
Carlos Mocholí	0de8ab4f2e	Fix failing master due to an interction between PRs (#10627 )	2021-11-19 02:04:53 +00:00
Carlos Mocholí	35f6cbe09f	Use `update_wrapper` in test_hooks.py (#10578 )	2021-11-19 01:52:55 +01:00
Adrian Wälchli	1ff35ed0f5	Improve code quality in `AcceleratorConnector._configure_slurm_ddp` (#10102 )	2021-11-17 23:10:47 +00:00
Carlos Mocholí	0fa07da987	Fail the test when a `DeprecationWarning` is raised (#9940 )	2021-11-17 23:41:50 +01:00
Carlos Mocholí	ba036fdeea	Support special test parametrizations (#10569 )	2021-11-17 15:46:14 +00:00
Rohit Gupta	de7ef41fea	remove deprecated `reload_dataloaders_every_epoch` from `Trainer` (#10481 )	2021-11-16 06:47:43 +00:00
Carlos Mocholí	6dfcb6afc5	Skip strategy=ddp_spawn, accelerator=cpu, python>=3.9 tests (#10550 )	2021-11-16 10:06:47 +05:30
a-gardner1	ce149f6451	Fix support for dataclasses with ClassVar/InitVar in `apply_to_collection` (#9702 ) Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>	2021-11-10 04:42:27 +00:00
Adrian Wälchli	a270a79ed9	Rename "master" methods to "main" in ClusterEnvironment plugins (#10103 ) * rename occurrences of master port, master address, maser node, master process * rename properties * add property decorators * occurrences in docs * update changelog * update changelog * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add lost method * create deprecation * add changelog * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix typo (but it was already there!!!) * Apply suggestions from code review Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com> * add todo * update more occurences * add types * add missing import Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>	2021-11-08 12:32:58 +00:00
puhuk	412f0a4d24	Remove deprecated dataloader arguments in Trainer methods (#10325 ) Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>	2021-11-04 11:03:39 +01:00
Carlos Mocholí	ba23d91320	Update recommendation on `dataloader_idx` (#10318 )	2021-11-04 01:39:55 +01:00
victorjoos	cc0e9f96a8	Add support for empty `gpus` list to run on CPU (#10246 ) Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>	2021-11-01 18:37:38 +00:00
Carlos Mocholí	81d15c5986	Implement double optimizer closure for hook structure consistency (#10167 )	2021-10-29 13:03:04 +00:00
Carlos Mocholí	03f01fb5ec	Fix gradient norm tracking and gradient clipping (#9287 ) * WIP * Progress * Undo test change * Fix plugin closure execution order * Update CHANGELOG * Fix manual optimization on AMP and skipping backward * Fix for deepspeed * Typo * Hook test for manual closure * Add skipping test with AMP * You are hideous, apex * Add deepspeed test * Update CHANGELOG * Fix for broken master * Add RunIf * FIXMEs * Rename * Fix grad norm * add a simple test * update test * update test * update test * fix merge conflicts * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Sea of changes * Undo change * Introduce TPUPrecisionPlugin * Undo changes * Undo changes * Resolve FIXME * Undo change * Undo change * Undo change * Fix FIXMEs * Fix FIXME * Correct value * Bad merge * Fix circular imports * WIP * Fixing clipping * Fixes * Bad merge * Move optimizer step and clipping into the `PrecisionPlugin` * Fix AMP * Update CHANGELOG * Fix tests * Underscore * Progress * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove pre_optimizer_step * Missed one * Progress * Progress * Fix test * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update FIXMEs * Fix test * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix test * DeepSpeed warning. mypy * Rename * Finish tests * Update CHANGELOG * Dumb fixes * accelerator=auto * Apply suggestions from code review Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * Update on comments * Use ClassifModule Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2021-10-28 15:23:27 +00:00
Carlos Mocholí	5262b63dff	Pass the scaler as an input to `NativeMixedPrecisionPlugin` (#10055 ) Co-authored-by: thomas chaton <thomas@grid.ai>	2021-10-28 14:13:53 +00:00
Carlos Mocholí	dbe1662dc3	Replace `_TORCH_GREATER_EQUAL_DEV_1_10` with `_TORCH_GREATER_EQUAL_1_10` (#10157 )	2021-10-27 13:38:39 +01:00
Rohit Gupta	34d5980df6	Raise `MisconfigurationException` if `trainer.eval` is missing required methods (#10016 )	2021-10-25 23:12:08 -07:00
Rajat Goel	47e7a2860f	Fix Enums parsing in generated hparms yaml (#9170 ) Co-authored-by: tchaton <thomas@grid.ai> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2021-10-25 21:23:20 +00:00
Danielle Pintz	1f7bd6650c	Mark accelerator connector as protected (#10032 )	2021-10-25 19:24:54 +00:00
jjenniferdai	6d79184ec5	Unify checkpoint load paths [redo #9693 ] (#10061 )	2021-10-25 19:05:31 +00:00
Adrian Wälchli	76081fb846	Mark SLURM detection methods in `AcceleratorConnector` as protected (#10101 ) Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>	2021-10-25 17:52:15 +00:00
Carlos Mocholí	b376799430	Minor fixes related to clipping (#10130 ) Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2021-10-25 16:40:22 +00:00
Adrian Wälchli	7eb2edf421	rename set_random_master_port (#10104 ) Co-authored-by: tchaton <thomas@grid.ai>	2021-10-25 12:09:05 +00:00
Kaushik B	56bc55db71	Update strategy flag in docs (#10000 ) Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>	2021-10-20 21:02:53 +05:30
Carlos Mocholí	f0b3e0f4de	Default to `precision=bf16` on CPU when `precision=16` is passed (#10033 )	2021-10-20 13:25:13 +00:00
Rohit Gupta	0aa220b46b	Remove deprecated `distributed_backend` from `Trainer` (#10017 ) * rm distributed_backend from Trainer * unused * chlog * internal distributed_backend * Docstring Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>	2021-10-19 13:54:37 +00:00
Kaushik B	5e8829b97d	(1/n) tests: Use strategy flag instead of accelerator for training strategies (#9931 ) Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>	2021-10-16 20:40:25 +05:30
Carlos Mocholí	e973bcb76a	Use non-deprecated options in tests (#9949 )	2021-10-15 16:58:07 -07:00
Rohit Gupta	23e8b59ae7	Add `configure_gradient_clipping` hook in `LightningModule` (#9584 ) * init hook * docs * dep train args * update tests * doc * doc * .gitignore * not dep * add trainer args * add & update tests * fix tests * pre-commit * docs * add docs * add exception * code review * deepspeed * update tests * not * try fix * Apply suggestions from code review * update deepspeed * disable some tests * disable some tests * enable all tests	2021-10-13 20:15:13 +05:30
Kaushik B	05b15e63f0	Add `strategy` argument to Trainer (#8597 ) Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2021-10-13 12:34:06 +00:00
ananthsub	28fc8d2016	Add `enable_model_summary` flag and deprecate `weights_summary` (#9699 ) Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Kaushik B <kaushikbokka@gmail.com>	2021-10-13 17:20:54 +05:30

420 Commits