Carlos Mocholí
1dd61e4e35
Extend support for logging a collection ( #7771 )
2021-06-01 12:51:50 +01:00
Jirka Borovec
9a001fea22
update NGC docker ( #7787 )
2021-06-01 12:11:29 +02:00
Jirka Borovec
0b6fd1da54
Update pre-commit and add new hooks ( #7781 )
...
* update precommit
* Update .pre-commit-config.yaml
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* Apply suggestions from code review
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-06-01 07:43:50 +02:00
Carlos Mocholí
0dd6d3a798
Avoid adding `None` loss values in `training_epoch_end` ( #7772 )
2021-05-31 19:28:28 +00:00
Adrian Wälchli
7e6010fc93
fix info message when max training time reached ( #7780 )
...
* call time_elapsed
* elapsed formatting
* format
* update test
* changelog
2021-05-31 14:50:16 +02:00
Carlos Mocholí
d47173bb72
Use typing forward references ( #7770 )
...
* Use typing forward references
* Update pytorch_lightning/core/lightning.py
2021-05-31 09:54:28 +02:00
Carlos Mocholí
a69beab499
Clean existing logging tests ( #7760 )
...
* Remove dev debugger metric tracking
* Fix tests
* Fix test
* Import
* Clean logging tests
* flake8
* Docstring
2021-05-30 16:36:52 +02:00
Carlos Mocholí
fa8f0363ee
Some test updates ( #7761 )
...
* Some test updates
* flake8
2021-05-30 13:15:25 +02:00
Carlos Mocholí
5f0863e5e5
Organize trainer properties ( #7758 )
...
* Organize trainer properties
* Single quote
* Double quote
2021-05-30 13:09:01 +02:00
Carlos Mocholí
bc3238be8c
Remove metric tracking from dev debugger ( #7759 )
...
* Remove dev debugger metric tracking
* Fix tests
* Fix test
* Import
* Fix tests
* Fix test
* flake8
* Fix tests
2021-05-30 12:03:42 +02:00
Justus Schock
5fc6f065bf
Add Test for memory consumption ( #7733 )
...
* Add Test to ensure Training Batch is no longer in GPU memory when running validation
* Add Test to ensure Training Batch is no longer in GPU memory when running validation
* Add Test to ensure Training Batch is no longer in GPU memory when running validation
* Temporary disable other tests
* Verbose asserts
* Verbose asserts
* update tests + revert to original ci
* Update test_evaluation_loop.py
* Update test_evaluation_loop.py
* Update tests/trainer/loops/test_evaluation_loop.py
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* Update tests/trainer/loops/test_evaluation_loop.py
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: justusschock <justus.schock@psoteo.de>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-05-28 19:40:23 +05:30
Mauricio Villegas
f6b5e3df57
Added save_config_filename init argument to LightningCLI ( #7741 )
2021-05-28 09:30:16 +02:00
Boris Dayma
9097347ea8
feat(wandb): log models as artifacts ( #6231 )
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-05-27 20:15:02 +02:00
Carlos Mocholí
9304c0df8f
Rename and move Result ( #7736 )
...
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
2021-05-27 12:27:52 +00:00
Carlos Mocholí
906c067b07
Update hooks pseudocode ( #7713 )
2021-05-27 12:27:26 +02:00
Kaushik B
04dcb1786d
Add `__len__` method to IndexBatchSamplerWrapper ( #7681 )
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-05-26 18:20:13 +02:00
Kaushik B
b1a7b7e9bf
Add `tpuvm` section in TPU docs ( #7714 )
2021-05-26 12:41:00 +00:00
Carlos Mocholí
311d9fe67e
Always run validation inside the training loop epoch ( #7357 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-05-26 14:26:48 +02:00
Tomy Hsieh
037a71b156
Update README.md ( #7717 )
2021-05-26 12:58:11 +02:00
Aki Nitta
71c1017092
Update sphinx version to 4.0 or later ( #7716 )
2021-05-26 11:33:24 +02:00
Kaushik B
27eb0035ca
Increase TPU Check timeout ( #7706 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-05-26 01:44:29 +00:00
Carlos Mocholí
d26953c8bc
Add `ModelPruning(prune_on_train_epoch_end)` to choose when to apply pruning ( #7704 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-05-26 00:57:56 +02:00
Aki Nitta
b2d77a6798
CI: Reset cache weekly ( #7686 )
...
* Reset cache weekly
* Update ci_test-base.yml
* Update docs-checks.yml
* Update ci_test-mnodes.yml
* Update release-pypi.yml
* Remove if latest
2021-05-25 22:53:38 +00:00
Xinyao(Alvin) Sun
7e2f7e956b
fix: improve UserWarning message ( #7685 )
...
* fix: improve UserWarning message
when both overfit and training dataloader shuffling are enabled
fixes issue: #7656
* chore: update changelog
* Polish userwarning msg in pytorch_lightning/trainer/data_loading.py
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
* shuffling typo
* Update CHANGELOG.md
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-05-25 17:35:15 +00:00
Kaushik B
e7057d5898
Add `should_rank_save_checkpoint` property to Training Plugins ( #7684 )
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-05-25 23:02:05 +05:30
Carlos Mocholí
3e5d6e906a
Remove on epoch guard from the should stop validation check ( #7701 )
2021-05-25 17:16:46 +02:00
Carlos Mocholí
565b62f11a
Remove on epoch guard from the should stop validation check ( #7701 )
...
* Remove on epoch guard from the should stop validation check
* Formatting
2021-05-25 16:03:55 +01:00
Carlos Mocholí
ffe2cbeda1
Remove on epoch guard from the should stop validation check ( #7701 )
...
* Remove on epoch guard from the should stop validation check
* Formatting
2021-05-25 16:03:43 +01:00
Carlos Mocholí
a1c40f3207
Remove on epoch guard from the should stop validation check ( #7701 )
...
* Remove on epoch guard from the should stop validation check
* Formatting
2021-05-25 15:59:42 +01:00
Carlos Mocholí
e2ead9abd7
Refactor some loops code and hook tests ( #7682 )
2021-05-25 13:27:54 +02:00
Carlos Mocholí
8ba6304c73
Increment the total batch idx before the accumulation early exit ( #7692 )
...
* Increment the total batch idx before the accumulation early exit
* Update CHANGELOG
2021-05-25 10:23:40 +02:00
Carlos Mocholí
fe1c4ca273
Move test_hooks.py code ( #7689 )
2021-05-24 22:26:32 +00:00
Kaushik B
2c10ecc232
MAINTAINER has been deprecated ( #7683 )
2021-05-25 00:01:31 +05:30
Jirka Borovec
ad168fc4c6
chlog for 1.3.2 + legacy test ( #7676 )
2021-05-24 17:55:02 +00:00
Carlos Mocholí
8b01497e42
Fix global step update when the epoch is skipped ( #7677 )
...
* Fix global step update when the epoch is skipped
* Update CHANGELOG
* Move test
2021-05-24 17:36:56 +01:00
Kaushik B
3f460b150a
Move parameter validation specific to TPU Training plugins ( #7415 )
...
* Move parameter validation specific to TPU Training plugins
* update docstring
2021-05-24 16:02:01 +00:00
ananthsub
fa41c588f4
Remove ProfilerConnector class ( #7654 )
...
* Remove ProfilerConnector class
* Update trainer.py
* Update CHANGELOG.md
* Update trainer.py
* Update trainer.py
* tests
2021-05-24 08:58:15 -07:00
Gyeongjae Choi
a54bc5dba3
Fix progress bar print error when called before training ( #7674 )
...
* Check progress bar existence before printing
* Add tests for predict_progress_bar
* Add tests for progress_bar printing without training
* Update changelog
2021-05-24 17:33:28 +02:00
Carlos Mocholí
2103b5efc9
Move sync code from step result to lightning module [6/n] ( #7651 )
2021-05-24 13:13:55 +01:00
Xinyao(Alvin) Sun
0c958c5a1f
Fix dataloaders are not reset when tuning the model ( #7566 )
...
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-05-24 10:21:45 +02:00
shuyingsunshine21
299f2c481b
FSDP with full state dict ( #7487 )
...
* Fix some test errors
Summary:
Test Plan:
Reviewers:
Subscribers:
Tasks:
Tags:
* checkpoint consolidation
* Update ddp_spawn.py
* Update test_metric_result_integration.py
* Update test_results.py
* Update utils.py
* Update utils.py
* Update test_all_gather_grad.py
* Update test_all_gather_grad.py
* Update test_results.py
* Revert "Update test_results.py"
This reverts commit 9d4a2b891d.
* Revert "Merge pull request #1 from shuyingsunshine21/shuyingsunshine21-checkpoint_consolidate"
This reverts commit c5053da789, reversing changes made to 0d23d75bc9.
* Revert "Update test_all_gather_grad.py"
This reverts commit 0d23d75bc9.
* Revert "Update utils.py"
This reverts commit 70fe5da9c6.
* Revert "Update utils.py"
This reverts commit a9aae99f6e.
* Revert "Update test_results.py"
This reverts commit ea74906878.
* Revert "Update test_metric_result_integration.py"
This reverts commit bf70e431b3.
* Revert "Update ddp_spawn.py"
This reverts commit f17210183b.
* Revert "checkpoint consolidation"
This reverts commit 536c1323b0.
* Revert "Revert "checkpoint consolidation""
This reverts commit 3a9fde915a.
* Revert "Revert "Revert "checkpoint consolidation"""
This reverts commit 7a369f47e1.
* Revert "Revert "Update ddp_spawn.py""
This reverts commit 8222dc98ea.
* Revert "Revert "Update test_metric_result_integration.py""
This reverts commit 6c095b2370.
* Revert "Revert "Update test_results.py""
This reverts commit 250d0aaaa2.
* Revert "Revert "Update utils.py""
This reverts commit 8651d54d79.
* Revert "Revert "Update test_all_gather_grad.py""
This reverts commit dcdcd29731.
* modify distributed environment to make test pass
* fix version for ddp plugin test
* fix
* fix
* changelog
* Update CHANGELOG.md
* fsdp with full state dict
* fix missing import
* modify unittest
* fix
* fix
* fix typo
* modify test and add changelog
* fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* limit max_epoch to 1 for testing
* test
* fix
* update
* testing remove special for multi gpu
* assert gpu
* add assertion for gpu
* fix
* Re-enable special test, use ModelCheckpoint
* Fix paths
* Fix path passing
* test
* test
* fix test
* fix
* pre-commit format
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: SeanNaren <sean@grid.ai>
2021-05-24 08:11:45 +01:00
Xinyao(Alvin) Sun
01109cdf0c
Fix/mismatched toggle optimizer ( #7563 )
...
* fix: avoid potential mismatched toggling of optimizer
Refs #7405
chore: update CHANGELOG
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
fix: resolve a conflict
chore: update changelog
* feat: add a test that fails in master
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix typo in tests/trainer/optimization/test_multiple_optimizers.py
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
* Polish tests/trainer/optimization/test_multiple_optimizers.py
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* Polish tests/trainer/optimization/test_multiple_optimizers.py
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* fix: change placeholder in optimizer_step from positional args to keyword args
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-05-23 04:30:28 +02:00
shuyingsunshine21
2242423b75
refactor accelerator teardown -> training type plugin teardown ( #7579 )
2021-05-22 13:19:24 -07:00
Carlos Mocholí
a8d9b5f783
Remove tbptt `self.log` flags and other dead code [5/n] ( #7644 )
2021-05-22 01:13:00 +00:00
Carlos Mocholí
33a1f5271f
[2/N] Define dataclasses for progress tracking ( #7574 )
...
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
2021-05-22 03:09:08 +02:00
Carlos Mocholí
110e49dc99
De-duplicate `DistributedSampler` mentions ( #7636 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
2021-05-21 23:01:13 +02:00
Yifu Wang
8d6e2ff7b2
Improve argument validation for validate(), test(), and predict() ( #7605 )
...
Co-authored-by: Yifu Wang <yifuwang@2012@gmail.com>
2021-05-21 09:03:16 -07:00
Carlos Mocholí
e16d4fbdee
CI code cleaning ( #7615 )
2021-05-21 11:35:12 +00:00
ananthsub
f6d892ac21
[feat] Support custom filesystems in LightningModule.to_torchscript ( #7617 )
...
* [feat] Support custom filesystems in LightningModule.to_torchscript
* Update CHANGELOG.md
* Update test_torchscript.py
* Update test_torchscript.py
* Update CHANGELOG.md
* Update test_torchscript.py
2021-05-21 11:23:15 +00:00
Carlos Mocholí
e8a46bee15
Remove `Result(minimize)` parameter [4/n] ( #7628 )
2021-05-21 12:58:52 +02:00