Sean Naren
49df107bdd
[docs] Update FSDP instructions and add DeepSpeed evaluate/predict example ( #8713 )
2021-08-04 15:21:30 +00:00
Thien Tran
052aefc342
WandbLogger to log model topology by default ( #8662 )
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-08-04 10:36:57 +00:00
Sean Naren
560a5c3fc5
Add functions to collate deepspeed zero 3 checkpoints ( #8701 )
2021-08-04 09:39:02 +00:00
Caleb Robinson
9ca02f58ae
Fix an import deprecation warning ( #8687 )
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-08-03 22:17:28 +00:00
samlurye
f90849cc95
Deprecate LightningModule.summarize() in favor of pl.utilities.model_summary.summarize() ( #8513 )
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-08-03 22:08:51 +00:00
Elad Segal
08fba96b6c
Add `batch_size`, `rank_zero_only` arguments for `log_dict` to match `log` ( #8628 )
2021-08-03 22:05:34 +00:00
Sean Naren
98319f83bf
Reduce title length ( #8709 )
2021-08-03 23:17:10 +02:00
Jirka Borovec
0e6ee9c39d
CI: add mdformat ( #8673 )
* add mdformat
* exclude chlog
* fix ***
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-08-03 18:19:09 +00:00
Sean Naren
49d03f87fe
[docs] Update deepspeed docs, add some more information and link to streamlit ( #8691 )
2021-08-03 16:12:36 +00:00
Sean Naren
a1be6217ce
Expand the use cases, move them up for discoverability ( #8692 )
2021-08-03 11:47:20 +00:00
Daniel Stancl
08ac079c2f
Fix mypy typing for `utilities.cloud_io.py` ( #8671 )
Co-authored-by: tchaton <thomas@grid.ai>
2021-08-03 11:56:28 +02:00
Isaac
8274183bf2
Add check for unique device ids ( #8666 )
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
2021-08-03 08:18:51 +00:00
Sean Naren
e5d9e21dea
Fix save/load/resume from checkpoint for DeepSpeed Plugin ( #8397 )
2021-08-02 22:31:05 +00:00
Kaushik B
d01d8334b5
Fix `ddp` accelerator choice for cpu ( #8645 )
* Fix ddp accelerator choice for cpu
2021-08-02 21:24:07 +00:00
thomas chaton
dd8216a6b8
Save the `ResultCollection` in the loops state dict ( #8641 )
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-08-02 20:52:24 +00:00
thomas chaton
567e905ead
update logic to inject FastForwardSampler / CaptureIterableDataset 2/n ( #8366 )
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Justus Schock <justus.schock@posteo.de>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-08-02 20:52:06 +00:00
thomas chaton
15fb32037d
Test `metric_attribute` for different children module structures ( #8675 )
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-08-02 20:51:15 +01:00
thomas chaton
9e61de2063
Torch Elastic DDP DeadLock bug fix ( #8655 )
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-08-02 21:48:43 +02:00
Carlos Mocholí
d83dd7969d
Disable recurrent events on forks ( #8668 )
2021-08-02 18:12:13 +00:00
Jirka Borovec
661522e173
black: magic trailing comma ( #8560 )
2021-08-02 20:02:36 +02:00
Carlos Mocholí
ca96b2d23e
Delete deprecated save function ( #8680 )
2021-08-02 19:28:31 +02:00
Jirka Borovec
f67892ea96
CI: yesqa ( #8564 )
* add yesqa
* fix flake8
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-08-02 16:05:56 +00:00
Jirka Borovec
66cc505339
update NGC ( #8652 )
* update NGC
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-08-02 16:05:36 +00:00
Carlos Mocholí
cf0d362658
Delete deprecated `TrainerTrainingTricksMixin` ( #8679 )
2021-08-02 18:00:32 +02:00
Carlos Mocholí
d187008e84
Un-skip some Horovod tests ( #8676 )
2021-08-02 17:54:05 +02:00
Kaushik B
850416f0a0
Fix distributed types support for CPUs ( #8667 )
2021-08-02 16:42:28 +05:30
thomas chaton
85bba06529
update ( #8674 )
2021-08-02 11:56:09 +02:00
Sean Naren
7a1e97203e
Add property to skip restoring optimizers and schedulers via plugin ( #8644 )
2021-07-31 10:08:10 +02:00
Daniel Stancl
1f01db8b30
Fix mypy in utilities.argparse ( #8124 )
Co-authored-by: tchaton <thomas@grid.ai>
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-07-30 16:36:55 +00:00
Adrian Wälchli
16392a7de7
Update links for `zero_grad` to PyTorch docs ( #8618 )
2021-07-30 16:09:36 +02:00
Wei Ji
a78709751a
Reverse width, height to height, width in docs ( #8612 )
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-07-30 13:56:17 +00:00
Rio H
ba8053492f
Deprecate LightningModule.model_size ( #8495 )
Co-authored-by: Caleb Robinson <calebrob6@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-07-30 13:53:40 +00:00
Adrian Wälchli
529c42f848
fix collecting training_step outputs ( #8613 )
2021-07-30 13:03:15 +00:00
Carlos Mocholí
5789e9f5e4
Fix reference issues during epoch end result collection ( #8621 )
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-07-30 12:16:47 +00:00
Carlos Mocholí
93784da2c3
Fix pre-commit blacken-docs failures ( #8624 )
2021-07-30 12:10:15 +00:00
Adrian Wälchli
1bc052c290
Remove dead code in eval loop output tracking ( #8625 )
2021-07-30 14:04:51 +02:00
Carlos Mocholí
bb4887368c
Docs improvements around hparams ( #8577 )
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-07-30 11:06:03 +00:00
Carlos Mocholí
9720e264f5
Fix references for `ResultCollection.extra` and improve `str` and `repr` ( #8622 )
2021-07-30 12:47:34 +02:00
Sean Naren
07b7dc9c17
[Fix] Add delay property for checkpointing, refactor loading checkpoint (DeepSpeed Checkpointing Fix 1/n) ( #8627 )
* Add property to delay checkpointing, move loading checkpoint file into the run function to allow deepspeed engine to be loaded
* Add a small test
* Apply suggestions from code review
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* Update pytorch_lightning/accelerators/accelerator.py
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* Address review
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-07-30 11:31:08 +01:00
Adrian Wälchli
b6ea6373dd
exclude mpi run from auto-detection of horovod ( #8610 )
2021-07-30 12:01:00 +02:00
Carlos Mocholí
c99e2fe0d2
Test `Callback.on_load_checkpoint` order ( #8588 )
2021-07-29 12:28:29 +02:00
Adrian Wälchli
7901d297d3
remove support for optimizer_idx in the training_step for manual optimization ( #8576 )
2021-07-29 08:30:45 +00:00
Kaushik B
9c80727b8c
Add ddp_cpu to DistributedType Enum ( #8596 )
2021-07-29 10:02:32 +02:00
Carlos Mocholí
c2199fbbee
Fix `trainer.fit_loop.split_idx` reference ( #8601 )
* Fix split idx reference
* Update CHANGELOG
* Add comment
2021-07-29 08:00:04 +00:00
Carlos Mocholí
0dc0472e1f
Use class name in SWA info message ( #8602 )
2021-07-29 09:39:46 +02:00
Carlos Mocholí
ebd2e87752
Delete deprecated `TrainerLoggingMixin` ( #8609 )
* Delete deprecated `TrainerLoggingMixin`
* Update CHANGELOG
* Delete from Trainer
2021-07-29 08:39:16 +02:00
Adrian Wälchli
8c27fa71fa
[1 / 3] improvements to saving and loading callback state ( #6886 )
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-07-29 00:12:32 +02:00
Jirka Borovec
0c0b24c031
Prune deprecated metrics ( #8586 )
* drop metrics
* drop tests
* fix imports
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-07-28 16:57:31 +00:00
Carlos Mocholí
47c47faeae
Remove `outputs` in `on_train_epoch_end` hooks ( #8587 )
2021-07-28 18:27:54 +02:00
Jirka Borovec
470842f5c8
CI: validate JSON & fix benchmark ( #8567 )
* CI: validate JSON
* as GHA
* PT1.8
* 32g
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-07-28 18:09:15 +02:00