1461 Commits

Author SHA1 Message Date
ananthsub bbf27ed09a
Use fsspec in checkpoint connector for fault-tolerant training (#11776) 2022-02-07 13:29:41 +01:00
Rohit Gupta 7ec1e66e17
reduce only loss with dp (#11594)
Co-authored-by: Aki Nitta <nitta@akihironitta.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2022-02-07 17:00:29 +05:30
Krishna Kalyan f509e40ae3
Deprecate `on_before_accelerator_backend_setup` callback hook (#11655)
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
2022-02-07 11:07:21 +00:00
ananthsub a64438c897
Centralize rank_zero_only utilities into their own module (#11747)
* Centralize rank_zero_only utilities into their own module

Fixes #11746

* PossibleUserWarning
* Update test_warnings.py
* update imports
* more imports
* Update CHANGELOG.md
* Update mlflow.py
* Update cli.py
* Update api_references.rst
* Update meta.py
* add deprecation tests
* debug standalone
* fix standalone tests
* Update CHANGELOG.md
2022-02-07 08:09:55 +00:00
Dan Dale 9d8faecdb2
Allow Horovod `teardown()` to complete gracefully if exception thrown in callback setup (#11752) 2022-02-05 11:13:21 -08:00
Dan Dale 3bc2407239
Allow access to ckpt_path within context of fit() (#11696)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-02-05 05:23:16 +01:00
Carlos Mocholí 7da931d1ca
Support no pre-fetching (#11606) 2022-02-05 03:59:46 +00:00
Andres Algaba 58324b5197
Improve the result printing at the end of evaluation (#11332)
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
2022-02-05 03:03:22 +01:00
NathanGodey 8a1b1eeef8
WandbLogger's log_image can use step argument (#11716)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-02-05 01:02:41 +00:00
wangraying 8c07d8bf90
Add `Trainer(strategy="bagua")` (#11146)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: Sean Naren <sean@grid.ai>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
2022-02-04 17:02:09 +00:00
Rohit Gupta 4d72110b51
Deprecate `on_batch_start/on_batch_end` callback hooks (#11577) 2022-02-03 19:51:56 +00:00
Rohit Gupta 400201712f
added warning for distributedsampler in case of evaluation (#11479) 2022-02-03 18:42:13 +00:00
Rohit Gupta d132a9c3b7
Update CHANGELOG after the 1.5.9 release (#11558) 2022-02-03 14:25:27 +00:00
Rohit Gupta 01abe72278
Fix to avoid val progress bar disappear after validate (#11700)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-02-03 13:35:38 +00:00
Rohit Gupta e9065e9d42
Fix rich with uneven refresh rate tracking (#11668)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-02-03 10:27:05 +00:00
Rohit Gupta 7948ed703d
Avoid enforcing `shuffle=False` for eval dataloaders (#11575) 2022-02-03 09:35:31 +00:00
Danielle Pintz 9ebd7df22a
Move progress bar disabling out of the Trainer (#11377)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2022-02-03 06:29:32 +00:00
Rohit Gupta 0cb64fb8ba
Fix mid-epoch warning call while resuming (#11556)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2022-02-03 05:42:31 +00:00
four4fish d43fd0d4d6
Lazy initialize Strategy.parallel_devices (#11572)
Co-authored-by: Aki Nitta <nitta@akihironitta.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-02-03 04:25:16 +00:00
Rohit Gupta eceefdc602
Fix rich progress bar render only on main pbar (#11690) 2022-02-03 04:18:07 +00:00
Chunyang Wen 1c3ba7559d
Add change log for Accelerator (#11591) 2022-02-03 04:10:34 +00:00
Anton Schwaighofer f935319622
Allow a `CombinedLoader` as the training data in DDP (#11648)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2022-02-03 04:01:20 +00:00
Jirka Borovec c5de105276
fix available modules (#11526) 2022-02-03 03:38:16 +00:00
Carlos Mocholí 3d3172d3da
[CLI] Support shorthand for loggers (#11533) 2022-02-03 02:58:14 +00:00
Piyush Hirapara 72f0e5bfae
Deprecate `on_configure_sharded_model` callback hook for v1.6 (#11627)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Danielle Pintz <38207072+daniellepintz@users.noreply.github.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2022-02-03 02:29:26 +00:00
Krishna Kalyan 6586dd23b7
Mark `CheckpointConnector` as protected (#11550)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-02-03 02:26:08 +00:00
Akash Kwatra d5aa7717aa
Remove experiment property from abstract class (#11603)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-02-03 01:51:34 +00:00
Rohit Gupta ee049e123d
Fix rich progress bar metric render on epoch end (#11689)
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2022-02-03 01:43:48 +00:00
jjenniferdai ec1379da2c
Rename `_SupportsStateDict` --> `_Stateful` Protocol (#11469) 2022-02-02 23:45:59 +01:00
Carlos Mocholí b8e360dafa
[CLI] Fix bug that forces overriding `configure_optimizers` (#11672) 2022-02-02 22:44:00 +00:00
Nithin Rao b8d2c65a37
Set the state before saving "last" or "none" checkpoints (#11481)
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2022-02-02 23:07:05 +01:00
Carlos Mocholí d7944a13cd
Teardown all internal components on exception (#11620) 2022-02-02 21:10:19 +00:00
Rohit Gupta 3eee8f18cf
Sort simple profiler summary based on mean duration (#11671) 2022-02-02 20:44:42 +00:00
Rohit Gupta 76175217e4
Fix val_loop run on restart (#11552)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-02-02 20:19:34 +00:00
Carlos Mocholí a44881cd90
Changes in preparation to #8578 (#11562) 2022-02-02 19:57:08 +00:00
Carlos Mocholí 075b8801c9
Fix checkpoint values when saving and resetting the tuner state (#11518) 2022-01-20 18:54:40 +00:00
Carlos Mocholí 7295457a7b
[CLI] Save only the configuration used (#11532) 2022-01-20 12:35:43 +00:00
Rafał Jankowski e78d658c8d
Remove access to `_short_id` in NeptuneLogger (#11517) 2022-01-20 12:07:42 +00:00
ananthsub 1bd6fc979e
Remove `Strategy.on_tpu` property (#11536) 2022-01-20 08:25:26 +01:00
ananthsub f41d1e5e5e
Remove `Strategy.on_gpu` (#11537) 2022-01-19 21:27:12 +00:00
Rohit Gupta f7f835fa0e
improve simple profiler output (#11414) 2022-01-18 19:58:34 +00:00
Carlos Mocholí 62818dbace
Use a dataclass as the scheduler config (#11443) 2022-01-18 20:23:32 +01:00
Carlos Mocholí 344ab1e0a5
Move the `lightning_optimizers` ownership to the `Strategy` (#11444) 2022-01-18 12:58:56 +01:00
Rohit Gupta 033dba1494
Disable attaching samplers when using `IterableDataset` (#11507) 2022-01-17 23:33:57 +01:00
Gautam R Gare ef4677ae7b
Change the default `prog_bar=False` to `True` in `LightningModule.log_grad_norm` (#11472)
* Reset on_step flag to True in log_grad_norm
* updated change log

Co-authored-by: Aki Nitta <nitta@akihironitta.com>
2022-01-18 02:34:50 +09:00
Rohit Gupta 3230ef8306
Update incorrect entries in changelog (#11501) 2022-01-17 14:59:23 +00:00
Carlos Mocholí 18bbb39eef
Set `Loop.restarting` recursively (#11442)
* Set `Loop.restarting` recursively
* Docs
* CHANGELOG
* Update pytorch_lightning/loops/epoch/training_epoch_loop.py
Co-authored-by: Aki Nitta <nitta@akihironitta.com>
2022-01-14 19:25:23 +09:00
Carlos Mocholí f5bbc2cf17
Avoid in-place ops during logging result updates (#11401)
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
2022-01-12 09:09:36 +01:00
Aki Nitta 8dc36c3745
Fix inconsistent exceptions raised with no `rich` installed (#11360)
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2022-01-12 03:55:51 +00:00
Rohit Gupta 82c8875f33
Add `LightningModule.lr_scheduler_step` (#10249)
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2022-01-12 03:53:49 +00:00