5070 Commits

Author SHA1 Message Date
Mauricio Villegas b2e9fa814f
Improvements related to save of config file by LightningCLI (#7963)
* - Exclude SaveConfigCallback for fast_dev_run=True.
- SaveConfigCallback give a clearer message if config file already exists.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* - Added unit test
- Added entry in changelog
- Improved save config docstring

* Fix log line

* Fixes

* Fix changelog entry

* Update pytorch_lightning/utilities/cli.py

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Suggested fixed change

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-06-15 23:26:39 +02:00
Adrian Wälchli 971908a1aa
Loop Refactor 1/N - Training Loop (#7871)
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: tchaton <thomas@grid.ai>
Co-authored-by: Justus Schock <justus.schock@rwth-aachen.de>
Co-authored-by: Justus Schock <justus.schock@posteo.de>
Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
2021-06-15 12:55:06 +00:00
Carlos Mocholí 560b1970af
Standardize positional datamodule and argument names (#7431)
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-06-15 11:50:13 +00:00
Sean Naren 0974d66c6c
Add docs for IPUs (#7923)
* Added base docs for IPUs

* Fix

* Add details around poptorch profiler and model parallelism

* more description

* Add image

* Clearer messaging

* Cleanup

* Better name

* Add note

* Add some details around device iterations and model parallelism

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Add a small install comment

* Add clip gradients not supported

* Update docs/source/advanced/ipu.rst

Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>

* Add note

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
2021-06-15 10:16:47 +00:00
Sean Naren 024cf23c67
Remove convert_to_half, suggest using `model.half` (#7974) 2021-06-14 18:48:02 +01:00
Sean Naren f7459f5328
DeepSpeed Infinity Update (#7234)
* Update configs to match latest API

* Ensure we move the entire model to device before configure optimizer is called

* Add missing param

* Expose parameters

* Update references, drop local rank as it's now inferred from the environment variable

* Fix ref

* Force install deepspeed 0.3.16

* Add guard for init

* Update pytorch_lightning/plugins/training_type/deepspeed.py

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Revert type checking

* Install master for CI for testing purposes

* Update CI

* Fix tests

* Add check

* Update versions

* Set precision

* Fix

* See if i can force upgrade

* Attempt to fix

* Drop

* Add changelog

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-06-14 16:38:28 +00:00
Carlos Mocholí 03e7bdf8d5
Improve `LightningModule` hook tests (#7944) 2021-06-14 18:16:42 +02:00
Dan Dale 3a0ed02bd4
Properly handle parent modules w/ parameters in `BaseFinetuning` callback (#7931)
Co-authored-by: Daniel Dale <dan@distributedinsight.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-06-14 16:01:07 +00:00
Vatsalya Chaubey ce93d8bcfd
Handle errors due to uninitialized parameters (#7642)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-06-14 15:56:03 +00:00
Jirka Borovec cca0e7535a
remove parsing comments (#7958)
* remove parsing comments
* \s

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-06-14 14:24:48 +00:00
Eugene Huang 898fb56b16
added on_test_start() documentation (#7962)
Co-authored-by: ehuang68 <>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-06-14 14:19:48 +00:00
Seppo Enarvi 22d826615f
Seed all workers when using DDP (#7942)
* Seed all workers when using DDP

* Fix to dataloader seeding

* Make argument name explicit

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Use f-strings when logging

* Removed a redundant log message

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-06-14 14:39:50 +01:00
Carlos Mocholí 436fc53c89
Improve `LightningDataModule` hook test and fix `dataloader_idx` argument (#7941)
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
2021-06-14 12:42:13 +00:00
Adrian Wälchli 6b7b40473b
deprecate hpc_load() and integrate it with restore() (#7955)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-06-14 12:20:01 +00:00
Adrian Wälchli 20a5e09e33
fix myst-parser warning blocking docs ci (#7967) 2021-06-14 11:17:53 +00:00
Jirka Borovec f15ea6015e
update chlog + legacy chpt (#7954)
* update chlog

* legacy
2021-06-13 09:42:49 +05:30
Yuanzheng Wang 59d0c65613
Add dataclass support to `apply_to_collection` (#7935)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-06-12 11:42:49 +00:00
Mauricio Villegas cdd01f32da
LightningCLI support for argument links applied on instantiation (#7895)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-06-12 13:13:14 +02:00
Max Ehrlich 6856ccedfd
Remove rank_zero_only on DataModule prepare_data (#7945)
Signed-off-by: Max Ehrlich <max.ehr@gmail.com>
2021-06-12 12:50:29 +02:00
Sean Naren 96433d03ea
IPU Integration 5/5 (#7867)
* Initial changes

* Add broken example for now

* Fix reference

* Fix format

* Code runs

* Fixes

* Clear up files

* Add tests, helpers, fixes

* Small cleanups

* Refactors based on review

* Swap to special tests

* Add special tests

* Add source

* Cleanups

* Add logic to attach/detach model from devices

* Fixes for tests

* Fixes for tests

* Move earlier

* Cleanups

* Add check for nvcc

* Add tests, cleanups

* Fix errors

* fix

* Try condition

* Add missing annotation

* Clearer

* Clearer message

* Fix variable

* Cleanups

* Add comment

* CHANGELOG.md

* Add simple selection test

* Remove special=True to see what happens

* Fix test

* Update tests/accelerators/test_ipu.py

Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>

* Convert ipu_cores -> ipus

* Add typing, fail earlier

* simplify precision

* Add test, add helper

* fix accum

* Update pytorch_lightning/plugins/training_type/ipu.py

Co-authored-by: thomas chaton <thomas@grid.ai>

* Use stages

* Make sure warning message returned

* throw error

* Add more tests, use fs

* add comment

* Clean

* Address feedback, add IPU tests

* Fixes

* Fix signature

* Add types

* Remove autoround

* Add docstring

* ipu_cores -> ipus

* Add test, remove unnecessary precision set

* Add optimizer test

* Add precision back with test

* Address code review

* Change to probs

* Move some of the asserts earlier

Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-06-11 15:07:04 +00:00
Adrian Wälchli 42c7f2725e
refactor checkpoint loading for training type plugins (#7928)
* plugin loading logic

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* integrate loading for test

* fix

* fix

* unused import

* Update pytorch_lightning/trainer/connectors/checkpoint_connector.py

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-06-11 14:05:11 +01:00
Carlos Mocholí ac4eb0a06a
`is_overridden` improvements (#7918)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-06-11 13:47:00 +02:00
Carlos Mocholí 9e932f4dfd
Delete `on_after_backward` unused argument (#7925) 2021-06-10 17:38:30 -07:00
Burhanuddin Rangwala 8b73869369
Deprecate the default `EarlyStopping` callback monitor value (#7907)
* removed monitor default value and added deprecation message

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* format change

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* requested changes

* added test

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* format changes

* typehint change

* Update CHANGELOG.md

* requested changes

* regex

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-06-10 17:33:39 -07:00
Adrian Wälchli c1eac483e9
split `restore_training_state` into logical parts [2 / 2] (#7900) 2021-06-10 21:54:21 +02:00
Adrian Wälchli d209b68979
split `restore_training_state` into logical parts [1 / 2] (#7901)
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-06-10 15:36:02 +00:00
Jirka Borovec 111287b4f9
add pre-commit hooks (#7906) 2021-06-10 16:45:54 +02:00
Carlos Mocholí 839019a3a7
Remove legacy teardown check in train loop (#7917) 2021-06-10 15:02:14 +02:00
Carlos Mocholí b45a89a256
Clean-up after logger connector redesign 2/2 (#7631)
Co-authored-by: tchaton <thomas@grid.ai>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-06-10 12:09:01 +00:00
Sean Naren 07b69231ad
Remove fn check for ipu output (#7915) 2021-06-10 11:35:32 +00:00
Carlos Mocholí 580a3b5e32
Remove dead code (#7910) 2021-06-10 11:38:33 +01:00
Carlos Mocholí df812398b5
Clean-up after logger connector redesign 1/2 (#7909) 2021-06-10 06:21:03 +01:00
Carlos Mocholí ec4f8856af
Enable logger connector re-design (#7891)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-06-09 14:24:45 +00:00
Jirka Borovec 15be986558
add logger to __all__ (#6854) 2021-06-09 13:07:02 +00:00
ananthsub 6fee9262ff
Deprecate `LightningDataModule` lifecycle properties (#7657)
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-06-09 12:30:40 +00:00
Adrian Wälchli 764d2c775e
refactor CheckpointConnector.restore_weights (#7862) 2021-06-09 09:55:08 +00:00
Kaushik B 7f4ef6d135
Fix logs overwriting issue for remote fs (#7889)
* Fix logs overwriting issue for remote fs

* Add test
2021-06-09 11:05:01 +02:00
Carlos Mocholí c310ce661e
Logger connector re-design `_Metadata.reduce_fx` fixes. (#7890) 2021-06-09 01:21:01 -07:00
Carlos Mocholí b214442e74
New logger connector code (#7882)
* New logger connector code

* Update CHANGELOG

* Update requirements

* Fix import path

* Add new suffix

* Tests

* Minor changes

* Rename and reorder

* Fix import

* Formatting

* Fix with seed_everything?

* Fix test?

* Fix test?

* Minor change

* Minor changes

* Minor changes

* Force float

* Fix minimal bug

* Fix minimal bug

* Update with latest changes

* Fix import

* bad merge

* update typing

Co-authored-by: tchaton <thomas@grid.ai>
2021-06-08 20:20:17 +00:00
Carlos Mocholí b74f8ac149
Use `apply_to_collection` in `metrics_to_scalars` (#7888)
* Use `apply_to_collection` in `metrics_to_scalars`

* Typing

* Update CHANGELOG

* Update pytorch_lightning/utilities/metrics.py

* Whitespace
2021-06-08 12:54:32 -04:00
Jirka Borovec 0fda862274
Refactor notebooks (#7752)
* drop notebooks

* add submodule

* copy notebooks

* docs include ipynb

* fix headers

* CI

* readthedocs

* manifest

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* req

* workdir

* pandoc

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* pandoc

* manifest

* Apply suggestions from code review

* fix versions

* checkout

* `git submodule update --init --recursive --remote`

* notebooks @docs

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-06-08 16:30:13 +00:00
Jirka Borovec 4f3af42f83
better use of void (#7809)
* use void

* format
2021-06-08 15:36:50 +00:00
Carlos Mocholí 5593b6f772
Merge pull request #7872 from PyTorchLightning/refactor/logger-poc-changes
Random fixes for logger connector PoC
2021-06-08 09:04:16 -04:00
Carlos Mocholí 9d315be4df
Only track dev debugger events if enabled (#7875) 2021-06-08 12:11:20 +00:00
Carlos Mocholí 8cc55ebdb0
Add `log_grad_norm` hook to `LightningModule` (#7873) 2021-06-08 12:09:06 +01:00
Luis Perez f9fccdfb39
Move `training_output` validation to after `train_step_end` (#7868)
* move validation to after aggregation

* changelog

* add test for training_step_end

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-06-08 08:37:50 +00:00
Carlos Mocholí 3427cb728d
Stricter `FxValidator` and add hooks (#7874)
* Stricter FxValidator and add hooks

* Update CHANGELOG
2021-06-08 08:26:05 +01:00
Adrian Wälchli ce976769ef
update fsspec to 2021.06.0 (#7869) 2021-06-08 05:05:19 +05:30
Adrian Wälchli 20f37b85b6
add warning when Trainer(log_every_n_steps) not well chosen (#7734)
* add warning

* update changelog

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* logger check

* add docstring for test

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
2021-06-07 12:40:43 +00:00
Sean Naren 41be61c6f2
[IPU] Add hooks for IPU lifecycle 4/5 (#7864) 2021-06-07 12:06:41 +00:00