Commit Graph

1197 Commits

Author SHA1 Message Date
Jirka Borovec b434c479e7
Quantisation (#5706)
* empty

* sq

* obs


* int

* ts

* helpers

* chlog

* yapf

* avg

* dupl

* Apply suggestions from code review

Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>

* Apply suggestions from code review

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* fixes

* Apply suggestions from code review

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* fixes

* note

* warn

* 45

* link

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Apply suggestions from code review

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* yapf

* flake8

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-02-11 07:04:57 -05:00
Jirka Borovec 9475c845cb
Docs/fixes (#5914)
* wip

* ..

* ...

* Apply suggestions from code review

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
2021-02-11 10:22:07 +00:00
Carlos Mocholí e8190e8848
Convert progress bar metrics to float (#5692)
* MetricsHolder(to_float=True)

* Update CHANGELOG

* Update tests/callbacks/test_progress_bar.py

* flake8

Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>
2021-02-10 19:16:53 -05:00
chaton 7b00894130
[feat] Add StochasticWeightAveragingCallback (#5640)
* add swa callback

* switch back to 1.6.0

* remove optimizer_step

* move super

* update

* forgot update_parameters

* update on comments

* works for ddp

* resolve flake8

* remove set_model

* resolve flake8

* resolve cpu

* resolve flake8

* resolve flake8

* update

* update on comments
2021-02-11 00:05:59 +00:00
Carlos Mocholí a028171f26
Fix Pruning callback and add a few features (#5825)
* Remove pruning check because it was added in 1.4.0 and that is our minimal torch version

* Fixing many bugs

* Fix misconfig test

* Fix tests

* Improve error message

* Reduce whitespace

* WIP

* TODOs

* _MODULE_CONTAINERS

* Add LTH test

* Allow resampling

* Iterative pruning

* Log pruning percentage

* Properly make pruning permanent

* Fix docstring

* Minor changes

* Test loading non-permanent model

* correct bugs

* Revert "correct bugs"

This reverts commit ffb8d47547.

* Add beta warning

* Fix docs

* 2 verbosity levels

* OCD

Co-authored-by: Your Name <you@example.com>
2021-02-10 15:03:23 +00:00
Jirka Borovec c2c82dad62
CI: Azure (#5882)
* add base Azure pipeline

* skip
2021-02-10 04:43:26 -05:00
ananthsub d26702bd66
Enable purely iteration-based training (#5726)
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
Co-authored-by: Kaushik Bokka <kaushikbokka@gmail.com>
2021-02-10 08:51:08 +00:00
Jirka Borovec 9dd56398e3
fixing some compatibility with PT 1.8 (#5864)
* change default

* .

* p

* 0.21.2

* .

* fix

* .
2021-02-09 18:25:57 +01:00
Jirka Borovec a0f7831278
fix misleading imports in tests (#5873)
* fix imports

* .
2021-02-09 05:10:52 -05:00
rohitgr7 bcb6ee5d51 sync 2021-02-08 20:22:39 +01:00
Rohit Gupta cb67e1d0b2 Separate epoch validation from step validation (#5208)
* Separate epoch validation from step validation

* update system

* test

* baked logic in callbacks

* unbake logic in callbacks

* fix the call for scheduler

* use property

* pep

* correct rebase

* gitignore

* ref

* add tests

* fix

* add early stopping test

* trigger

* chlog

* rev

* 1.3

* log

* Apply suggestions from code review

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Update pytorch_lightning/trainer/training_loop.py

* Update CHANGELOG.md

* Apply suggestions from code review

Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

(cherry picked from commit e429f97b67)
2021-02-08 20:22:39 +01:00
Jirka Borovec e7c6e9d43d prepare v1.1.8 (#5839)
(cherry picked from commit 3b7afb932b)
2021-02-08 20:22:39 +01:00
Adrian Wälchli 3ad55a2f09
Fix fine-tuning callback test (#5643)
* fix

* batch size
2021-02-08 12:19:34 -05:00
Jirka Borovec bd920b4102
Refactor simplify tests (#5861)
* add new

* restructure

* yapf

* move

* fix
2021-02-08 11:52:02 +01:00
Jirka Borovec 42812bb003
prune SimpleModel (#5862) 2021-02-08 09:52:54 +01:00
Jirka Borovec 26bc754cc1
prune unused methods (#5860) 2021-02-08 02:31:44 +01:00
Jirka Borovec a53c6d1319
yapf tests metrics (#5845) 2021-02-06 11:41:40 -05:00
Jirka Borovec ec742310d4
yapf tests trainer (#5844) 2021-02-06 10:06:17 -05:00
Jirka Borovec 82943515dc
formatting tests: 1/n (#5843)
* utils

* tuner

* base
2021-02-06 08:22:10 -05:00
Jirka Borovec 91f63deabc
formatting tests: 5/5 (#5848)
* cb

* acc

* plug

* .
2021-02-06 07:28:26 -05:00
Jirka Borovec 4faaef7758
formatting tests: 4/n (#5846)
* models

* ckpt

* core

* log
2021-02-06 12:07:26 +01:00
Jirka Borovec f83cca6107
formatting flake8 & isort (#5824)
* formatting

* isort

* make

* yapf

* isort
2021-02-05 18:33:12 -05:00
tchaton 77be6f6e24 resolve conflicts
resolve doc

boring commit

docs

torchvision

tpu

Update dockers/tpu-tests/tpu_test_cases.jsonnet

Update dockers/tpu-tests/tpu_test_cases.jsonnet
2021-02-05 21:43:10 +01:00
ananthsub 06f65938ef Fix toggle optimizer (#5775)
* Update lightning.py

* update changelog

* add a 3 optimizer test

* resolve flake8

* remove extra code

* typo

* resolve typo

* Apply suggestions from code review

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

Co-authored-by: tchaton <thomas@grid.ai>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2021-02-05 21:43:10 +01:00
Kaushik B 5dfd62c09e Disable training with zero num_training_batches when insufficient limit_train_batches (#5703)
* disable training when zero num_train_batches with limit_train_batches

* refactor train skip condition

* fix formatting issues

* fix formatting issues

* ref: test error msg

* fix tests for data loader calls

* fix train dataloader condition

* update limit_train_batches upper range in test comment

* remove model state check test

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2021-02-05 21:40:42 +01:00
Rohit Gupta 2abf4693bc Fix log_dir property (#5537)
* fix and update tests

* update with ModelCheckpoint

* chlog

* wip wandb fix

* all fixed

Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2021-02-05 21:40:42 +01:00
noamzilo 84a8d2d178 Bugfix/5487 auto lr ordering (#5638)
* started to write failing test. just getting into the framework...

* started to write failing test. just getting into the framework...

* added failing test for misconfiguration of lr finder

* made test startup quickly. making sure without the fix it also fails slowly

* improved test

* fixed for linter

* fixed for linter

* yet another fix for the linter

* yet another fix for the linter

* fixed comment by @carmocca

* fixed comment by @carmocca

* Fix test

* chlog

* Apply suggestions from code review

* Fix test

* Update pytorch_lightning/tuner/lr_finder.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* Update pytorch_lightning/tuner/lr_finder.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* Update tests/trainer/test_lr_finder.py

* Update pytorch_lightning/tuner/lr_finder.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* Update pytorch_lightning/tuner/lr_finder.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* Update pytorch_lightning/tuner/lr_finder.py

* Update tests/trainer/test_lr_finder.py

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2021-02-05 21:40:40 +01:00
Sumanth Ratna 1c44f35cf3 Fix mypy 0.800 plus when prepending $PYTHONPATH to sys.path (#5698)
* Fix mypy when prepending $PYTHONPATH to sys.path

* attempt mypy fix

* Revert "attempt mypy fix"

This reverts commit fb7ed827d9.

* fix mypy

Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>
2021-02-05 21:40:40 +01:00
James Guillochon 4bf6dd122a Close SummaryWriter in TensorBoardLogger on finalize (#5696)
Not entirely sure this is the "right" solution to this problem, but currently when model fitting is finished the `TensorBoardLogger` attribute `_experiment` (a `SummaryWriter`) is left with an open file handle. This causes issues in particular on Windows systems (and probably others), and also makes the files un-syncable on cloud-synced devices like OneDrive. This PR adds a `close()` to `finalize` to make sure this handle is closed upon fit completion.
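
The idea described above, releasing the writer's file handle when fitting finishes, can be sketched roughly as follows. This is a minimal illustration and not the actual `TensorBoardLogger` code: `FileBackedLogger`, `log_metric`, and `demo` are hypothetical names, and a plain text file stands in for the `SummaryWriter`.

```python
import os
import tempfile


class FileBackedLogger:
    """Hypothetical stand-in for a logger that lazily opens a writer."""

    def __init__(self, path):
        self.path = path
        self._experiment = None  # opened on first use

    @property
    def experiment(self):
        # Lazily create the underlying writer (here: a plain file handle).
        if self._experiment is None:
            self._experiment = open(self.path, "a", encoding="utf-8")
        return self._experiment

    def log_metric(self, name, value):
        self.experiment.write(f"{name}={value}\n")

    def finalize(self, status):
        # Flush and release the handle so other processes (for example a
        # cloud-sync client) can access the file once fitting is done.
        if self._experiment is not None:
            self._experiment.flush()
            self._experiment.close()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        logger = FileBackedLogger(os.path.join(tmp, "events.log"))
        logger.log_metric("loss", 0.5)
        handle = logger.experiment
        logger.finalize("success")
        # The handle is closed, so the temporary directory can be removed
        # even on platforms that refuse to delete open files.
        return handle.closed
```

Without the `close()` call in `finalize`, the handle in this sketch would stay open until the object is garbage collected, which is the situation the commit message describes.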

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-02-05 21:40:40 +01:00
Adrian Wälchli bb7d188318 Fix ModelCheckpoint race condition in file existence check (#5155)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
2021-02-05 21:40:39 +01:00
Nicki Skafte 605c5a8c9a Fix `num_classes` arg in F1 metric (#5663)
* fix f1 metric

* Apply suggestions from code review

* chlog

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>
2021-02-05 21:40:30 +01:00
Adrian Wälchli 1feff5d774 move progress bar test to correct test folder (#5667) 2021-02-05 21:40:29 +01:00
chaton d8f2d8e15a
[Feat-BugFix] Resolve custom DataLoader (#5745)
* resolve custom dataloader

* update changelog

* fix tests

* update on comments

* resolve comments

* add support for custom batch_sampler

* Update tests/trainer/test_data_loading.py

* resolve test

* resolve flake8

* resolve yapf

Co-authored-by: Ubuntu <ubuntu@ip-172-31-88-60.ec2.internal>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-02-05 09:03:18 +00:00
Jirka Borovec d2c2e5004d fix tests 2021-02-04 20:55:58 +01:00
Jirka Borovec e633787a3d flake8 + yapf 2021-02-04 20:55:58 +01:00
Swetha Mandava c62f68c7cd passing batch outputs to on_train_batch_end (#4369)
* passing batch outputs to on_train_batch_end

* styling

* updating epoch end logic

* also condition on on_train_epoch_end hooks

* more readable

* pep8

* pep8

* readability suggestion accepted

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* adding test_training_epoch_end_metrics_collection_on_override test

* fix formatting

* fix formatting

Co-authored-by: Swetha Mandava <smandava@nvidia.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Roger Shieh <sh.rog@protonmail.ch>

(cherry picked from commit 5fcca4e43b)
2021-02-04 20:55:41 +01:00
Ryan Nett da5ba50727 Unify attribute finding logic, fix not using dataloader when hparams present (#4559)
* Rebase onto master

* indent fix

* Remove duplicated logic

* Use single return

* Remove extra else

* add `__contains__` to TestHparamsNamespace to fix tests

* Fix lightning_setattr to set all valid attributes

* update doc

* better names

* fix holder order preference

* tests for new behavior

* Comment about using the last holder

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>

(cherry picked from commit eee3b1a284)
2021-02-04 20:55:41 +01:00
manipopopo 97e1516349 Fix Metric.state_dict (#5614)
* Fix Metric.state_dict

* Update CHANGELOG.md

* Update CHANGELOG.md

* Detach tensors in a list if needed

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

(cherry picked from commit e87424adfb)
2021-02-04 20:55:41 +01:00
Piotr Jander 3ca1fbbf49 Ignore `step` param in Neptune logger's log_metric method (#5510)
* Ignore `step` param in Neptune logger's log_metric method

The `step` parameter is ignored because Neptune requires strictly increasing step values, a condition which is sometimes violated in Lightning, e.g. when `fit()` and `test()` are called one after another on some models. `step` could be enabled again once Lightning guarantees that step values are always strictly increasing.

Also a minor bugfix: the `log_text()` method should use Neptune's `log_text()` method.

* Update neptune.py

* Update test_neptune.py

* Update test_all.py

* fix neptune tests

* add chlog
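
The reasoning in the first bullet of this commit message (a backend that rejects non-increasing steps, and a logger that therefore drops `step`) can be illustrated with a small sketch. Everything here is hypothetical: `StrictStepSink` is a toy backend invented for the example and does not reflect Neptune's real API, and `log_metric` is only a stand-in for the logger method.

```python
class StrictStepSink:
    """Hypothetical backend that requires strictly increasing step values."""

    def __init__(self):
        self.last_step = None
        self.values = []

    def log(self, name, value, step=None):
        if step is not None:
            if self.last_step is not None and step <= self.last_step:
                raise ValueError(
                    f"step {step} is not greater than {self.last_step}"
                )
            self.last_step = step
        self.values.append((name, value))


def log_metric(sink, name, value, step=None):
    # `step` is accepted for interface compatibility but deliberately
    # dropped, so a restarted step counter cannot trigger the error.
    sink.log(name, value)


def demo():
    sink = StrictStepSink()
    # Steps restart at 0, as might happen when a test run follows a fit run.
    for step in (0, 1, 2, 0, 1):
        log_metric(sink, "loss", 0.1 * step, step=step)
    return len(sink.values)
```

Passing the same sequence of steps straight to `StrictStepSink.log` would raise on the fourth call, which is the failure the commit avoids by ignoring `step`.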

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
(cherry picked from commit 5d76b31881)
2021-02-04 20:55:41 +01:00
Adrian Wälchli b3b48c188c fix error when logging to progress bar with reserved name (#5620)
* warn about duplicate metrics

* update changelog

* suggestions from rohit

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* multiple values in message

* Apply suggestions from code review

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2021-02-04 20:55:41 +01:00
Lezwon Castelino b95471d4a4 Increase TPU check timeout (#5598)
* change timeout to 100

* add to CHANGELOG.md

* update test

* updates

* reduce TPU_TIMEOUT_CONSTANT during test

* Update tests/utilities/test_xla_device_utils.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* patch TPU_TIMEOUT_CONSTANT

* Apply suggestions from code review

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2021-02-04 20:55:41 +01:00
Philipp Singer 59361d595a fix Neptune logger creating multiple experiments when gpus > 1 (#3256)
* DP device fix

* potential fix

* fix merge

* update tests

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2021-02-04 20:55:40 +01:00
chaton e8206a9295 Mnodes (#5020)
* add a multi-nodes workflow

Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>
2021-02-04 20:55:40 +01:00
Carlos Mocholí 5ff1306582 Add new CHANGELOG section (#5580) 2021-02-04 20:55:40 +01:00
chaton e425bf3ba9
[BugOnFeat] Resolve bug with Finetuning (#5744)
* resolve bug + add doc

* Update pytorch_lightning/callbacks/finetuning.py

* resolve bug

* start adding more test

* add more tests for finetuning callback functions

* rename to flatten_modules

* resolve doc

* Update pytorch_lightning/callbacks/finetuning.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* resolve comments

* remove update on BoringModel

* update on comments

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-02-04 18:36:54 +00:00
Lexie Troiano d05cdf83f1 Merge remote-tracking branch 'carmocca/sync-1.1.5' into release/1.2-dev 2021-02-04 09:42:59 -05:00
Kaushik B 26cc3b5357
Change the seq of on_train_batch_end, on_batch_end & on_train_epoch_end, on_epoch_end hooks (#5688) 2021-02-04 18:30:20 +05:30
Adrian Wälchli 9555043a29
Force ModelCheckpoint callback to run last (#5731) 2021-02-03 16:40:57 -05:00
rohitgr7 a37416843b Fix sync
resolve wrong merge

tpu

yapf
2021-02-03 20:11:35 +01:00
Carlos Mocholí b7920b1c84 Fix logging on_train_batch_end in a callback with multiple optimizers (#5521)
* Start with the failing test

* Then fix the failing test

* Update CHANGELOG
2021-02-03 19:41:46 +01:00