Commit Graph

75 Commits

Author SHA1 Message Date
Adrian Wälchli 54147e0745
Update Fabric docs navigation (#16957) 2023-03-06 16:13:51 +01:00
Jirka Borovec 4e3273a81f
docs: deploy all (#16951) 2023-03-05 10:41:00 +00:00
Adrian Wälchli 5997332b93
Skip flaky ddp-spawn test on windows (#16942) 2023-03-03 15:26:28 +01:00
Jirka Borovec f697fff5db
docs: rename source-app (#16863)
* docs: rename source-app

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ci

* group check

* trigger

* param

* fix

* cleaning

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-02-28 10:04:43 +01:00
Jirka Borovec 0be025e8b7
rename docs/source-app & adjust docs links for lightning (#16676)
* update CI

* config / import

* lightning_app imports

* source/ dir

* html

* ci: dirs

* pr

* req dir

* on push

* rename

* drop

* cleaning
2023-02-13 10:59:02 +01:00
Jirka Borovec ec001cd64e
ci: replace pip cache with wheels (#16668) 2023-02-07 15:37:34 +00:00
Adrian Wälchli acb7ee223c
Ignore generated package files (#16605)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2023-02-02 09:24:07 +00:00
Jirka Borovec 7d4780adb1
move pytorch_lightning >> lightning/pytorch (#16594) 2023-02-01 18:22:42 +00:00
Jirka Borovec fda354a1f1
move lightning_fabric >> lightning/fabric (#16589)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-02-01 17:18:32 +00:00
Jirka Borovec 34140c0603
move lightning_app >> lightning/app (#16553)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-02-01 06:29:16 +01:00
Kushashwa Ravi Shrimali d738ab17e6
Init: Models store API (#15811)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
2023-01-27 12:27:04 +01:00
Jirka Borovec 799ced8430
ci: replace flake8 by ruff (#16433)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-01-19 11:48:28 -05:00
Akihiro Nitta fb12879fde
[docs][App] Include components in the API reference (#16414) 2023-01-18 09:06:21 +00:00
thomas chaton 592b12658a
[App] PoC: Add support for Request (#16047) 2022-12-16 14:19:10 +00:00
Jirka Borovec 61ee3fabc3
PKG: distribute single semver (#15374)
* global
* distrib ver
* codeowners
* Apply suggestions from code review

Co-authored-by: otaj <6065855+otaj@users.noreply.github.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-11-12 15:36:36 +00:00
Jirka Borovec d5003b1c07
prune installation artifact (#15558)
* prune installation artifact

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2022-11-08 09:54:38 -05:00
Carlos Mocholí 0c63534b7e
remove source-lit docs 2 (#15527) 2022-11-04 18:01:04 +01:00
William Falcon 9328da439b
docs updates 1/n (#15473)
* docs

* docs updates

* docs updates

* docs updates

* docs updates

* d

* d

* d

* d

* d

* d

* ??

* d1

* d1

* d1

* d1

* d1

* d1

* d1

* d1

* d1

* d1

* d1

* d1

* d1

* d1

* d1

* d1

* d1

* d1

* d1

* d1

* d1

* d1

* d1

* d1

* d1

* d1

* d1

* d1

* d

* d

* d

* d

* d

* d

* d

* d

* d

* d

* new title

* new title

* new title

* new title

* new title

* new title

* new title

* new title

* new title

* new title

* new title

* new title

* new title

* new title

* new title

* new title

* new title

* new title

* new title

* new title

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* only select from parent

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* use OSS template

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* only select from parent

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update docs/README.md

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

Co-authored-by: William Falcon <williamfalcon@Williams-MacBook-Pro-2.local>
Co-authored-by: William Falcon <williamfalcon@Williams-MBP-2.lan>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2022-11-03 10:55:30 -04:00
Ethan Harris bbf7848a5f
[App] Fix cluster logic (#15383) 2022-10-28 15:35:21 +01:00
thomas chaton b936fd4380
[app] Add CloudCompute ID serializable within the flow and works state (#14819) 2022-10-04 19:46:44 +00:00
thomas chaton 86fd5b22d4
(app) Make Logging DEBUG mode lazy (#14464) 2022-09-12 14:47:24 +00:00
Jirka Borovec 208bf6faa8
prepare space for fused docs (#14160)
* copy app conf

* ci + req.

* script symlink

* wip

* keep only App

* add also PL

* lightning

* artifact
2022-08-30 09:25:05 -04:00
Akihiro Nitta d5f35ece72
CI/CD: Add CUDA version to docker image tags (#13831)
* append cuda version to tags

* revertme: push to hub

* Update docker readme

* Build base-conda-py3.9-torch1.12-cuda11.3.1

* Use new images in conda tests

* revertme: push to hub

* Revert "revertme: push to hub"

This reverts commit 0f7d534b2a.

* Revert "revertme: push to hub"

This reverts commit 46a05fccbb.

* Run conda if workflow edited

* Run gpu testing if workflow edited

* Use new tags in release/Dockerfile

* Build base-cuda and PL release images with all combinations

* Update release docker

* Update conda from py3.9-torch1.12 to py3.10-torch.1.12

* Fix ubuntu version

* Revert conda

* revertme: push to hub

* Don't build Python 3.10 for now...

* Fix pl release builder

* updating version contribute to the error? https://github.com/docker/buildx/issues/456

* Update actions' versions

* Update slack user to notify

* Don't use 11.6.0 to avoid bagua incompatibility

* Don't use 11.1, and use 11.1.1

* Update .github/workflows/ci-pytorch_test-conda.yml

Co-authored-by: Luca Medeiros <67411094+luca-medeiros@users.noreply.github.com>

* Update trigger

* Ignore artifacts from tutorials

* Trim docker images to distribute

* Add an image for tutorials

* Update conda image 3.8x1.10

* Try different conda variants

* No need to set cuda for conda jobs

* Update who to notify ipu failure

* Don't push

* update filename

Co-authored-by: Luca Medeiros <67411094+luca-medeiros@users.noreply.github.com>
2022-08-10 10:37:50 +00:00
Laverne Henderson e33d25fb28
Porting latest App docs update (#13680)
* PRs 909,910,911, and 912

moves last 4 commits to the private repo to the OS repo

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix validation error

* Fixes API links and validation issues

* Update docs/source-app/examples/file_server/app.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* Fix Python validation errors

* update

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2022-08-01 10:25:40 -04:00
thomas chaton aefb9ab43f
(app) Introduce LightningTrainingComponent (#13830) 2022-07-29 16:44:52 +02:00
thomas chaton 4c35867b61
[App] Introduce Commands (#13602) 2022-07-25 17:13:46 +00:00
thomas chaton 5e26840f94
Introduce ServableModuleValidator Callback (#13614)
* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* Update tests/tests_pytorch/serve/test_servable_module_validator.py

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Update tests/tests_pytorch/serve/test_servable_module_validator.py

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Update src/pytorch_lightning/serve/servable_module_validator.py

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Update src/pytorch_lightning/serve/servable_module_validator.py

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Update src/pytorch_lightning/serve/servable_module_validator.py

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Typing improvements

* wip

* update doc

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* Update examples/pl_servable_module/production.py

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* update

* update

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-07-15 11:07:40 -04:00
otaj 663d4c9c28
Add BaseModelCheckpoint class to inherit from (#13024)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-06-30 10:07:46 +00:00
Jirka Borovec d2e4e7e003
create meta package [RFC] (#13327)
* placeholder

* move setup_tools & abstract about

* adjust lightning-app

* notes

* lightning about

* lightning init

* CI check

* ci

* install

* adjust manifest & mv chlog

* manifest

* pkg

* mv __setup__

* parse_requirements

* lit

* ci - pytorch

* wrap func

* ci

* cd draft

* generate lit

* pkg

* utf-8

* root pkg

* req.

* ver

* mypy

* try check

* meta pkg

* meta pkg - vars

* meta pkg - pruning

* meta pkg - fixing

* fix PL for meta

* multi-line wrapper

* hack manifest

* ci

* fix docstr

* fixing

* ci & mypy

* links
2022-06-27 09:34:18 -04:00
Adrian Wälchli 602ee65f74
Docs for LAI (#13312)
* edit

* docs

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixing

* clean generated

* ignore

* pre-commit

* ci

* ci

Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2022-06-16 23:07:30 -04:00
Jirka Borovec b58577fd4d
Future 3/n: docs adjustment (#13299)
* docs: rename source >> source-PL

* docs: fix typing

* readthedocs

* update paths & codeowners

* source-pytorch

* ci

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-06-15 10:54:53 -04:00
stanbiryukov 8034919c44
Remove deprecated `TestTubeLogger` (#12859)
* remove deprecated test_tube logger

* remove testube from logger __init__

* remove relevant testtube tests

* update CHANGELOG with removal of deprecated `TestTubeLogger`
2022-04-24 20:05:48 +02:00
Rohit Gupta 82c8875f33
Add `LightningModule.lr_scheduler_step` (#10249)
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2022-01-12 03:53:49 +00:00
thomas chaton 9e844d9db6
Lite Docs and Example Improvements (#10303)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-11-02 16:13:01 +01:00
Adrian Wälchli 3cd65b592b
Lightning Lite Examples (#9987)
Co-authored-by: Kaushik B <kaushikbokka@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: SeanNaren <sean@grid.ai>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: four4fish <88516121+four4fish@users.noreply.github.com>
Co-authored-by: Nicki Skafte Detlefsen <skaftenicki@gmail.com>
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: Pietro Lesci <61748653+pietrolesci@users.noreply.github.com>
2021-11-02 08:04:29 +00:00
Rohit Gupta 23e8b59ae7
Add `configure_gradient_clipping` hook in `LightningModule` (#9584)
* init hook

* docs

* dep train args

* update tests

* doc

* doc

* .gitignore

* not dep

* add trainer args

* add & update tests

* fix tests

* pre-commit

* docs

* add docs

* add exception

* code review

* deepspeed

* update tests

* not

* try fix

* Apply suggestions from code review

* update deepspeed

* disable some tests

* disable some tests

* enable all tests
2021-10-13 20:15:13 +05:30
Jirka Borovec 982a9560a5
Update notebooks submodule and add tutorial view to docs (#9420)
Co-authored-by: Ethan Harris <ethanwharris@gmail.com>
2021-09-16 15:14:37 +01:00
Jakub Kuszneruk ee3787216a
Adapt `NeptuneLogger` to new `neptune-client` api (#6867)
* Initial split to NeptuneLegacyLogger and NeptuneLogger

* Adapt NeptuneLogger to neptune-pytorch-lightning repo version

* Fix stylecheck and tests

* Fix style and PR suggestions

* Expect Run object in NeptuneLogger.init

* Model checkpoint support and restructured tests

* Reformat code - use " instead of '

* Fix logging INTEGRATION_VERSION_KEY

* Update CHANGELOG.md

* Fix stylecheck

* Remove NeptuneLegacyLogger

* updated neptune-related docstrings

* PR suggestions

* update CODEOWNERS file
* move import logic to imports.py
* minor neptune.py improvements

* formatting fixes and minor updates

* Fix generation of docs

* formatting fixes and minor updates

* fix

* PR fixes vol. 2

* define return type of _dict_paths method
* bump required version of `neptune-client`

* Enable log_* functions

* Update pytorch_lightning/loggers/neptune.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Revert "Enable log_* functions"

This reverts commit 050a436899b7f3582c0455dc27b171335b85a3a5.

* Make global helper lists internal

* Logger's `name` and `version` methods return proper results

* Initialize Run and its name and id at logger init level

* Make _init_run_instance static

* Add pre-commit hook code changes.

* Fix blacken-docs check

* Fix neptune doctests and test_all

* added docs comment about neptune-specific syntax

* added docs comment about neptune-specific syntax in the loggers.rst

* fix

* Add pickling test

* added myself to neptune codeowners

* Enable some of deprecated log_* functions

* Restore _run_instance for unpickled logger

* Add `step` parameter to log_* functions

* Fix stylecheck

* Fix checkstyle

* Fix checkstyle

* Update pytorch_lightning/loggers/neptune.py

Co-authored-by: thomas chaton <thomas@grid.ai>

* Fix tests

* Fix stylecheck

* fixed project name

* Fix windows tests

* Fix stylechecks

* Fix neptune docs tests

* docformatter fixes

* De-duplicate legacy_kwargs_msg

* Update .github/CODEOWNERS

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

Co-authored-by: kamil-kaczmarek <kamil.kaczmarek@neptune.ml>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-09-10 18:48:58 +02:00
Jirka Borovec 7978a5376d
Ipynb update (#8004)
* git submodule update --remote

* update notebooks in docs

* prune

* _notebooks

* docs

* path

* path

* ignore

* head
2021-06-17 16:46:05 +02:00
Adrian Wälchli 20a5e09e33
fix myst-parser warning blocking docs ci (#7967) 2021-06-14 11:17:53 +00:00
Aniket Maurya 0bad2186c1
Added Vulture dead code checker (#5654)
* integrated vulture CI

* added vulture in workflows

* added vulture in workflows

* vulture logs verbose set false

* Apply suggestions from code review

* ignore name list and args to underscore naming

* add ignore names

* deadcode whitelist

* deadcode whitelist

* Apply suggestions from code review

Co-authored-by: Rahul Jha <rahul722j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update whitelist

* Sort

* Updates

* Updates

* Apply suggestions from code review

* Updates

Co-authored-by: Aniket Maurya <aniket.maurya@gdn-commerce.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Rahul Jha <rahul722j@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>
Co-authored-by: Ethan Harris <ethanwharris@gmail.com>
2021-06-02 16:19:10 +01:00
Carlos Mocholí 36d180e532
Refactor base profilers 3/5 (#6621)
Co-authored-by: tchaton <thomas@grid.ai>
2021-03-23 10:07:35 +00:00
camruta e2e1de0fb7
Add teardown method to BaseProfiler. (#6370)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
2021-03-22 11:49:06 +00:00
chaton 6bc4490d01
[HotFix] Resolve TPU Training (#6027)
* fix tpus

* update

* add back reduction in val_loss

* resolve some bugs with TPUs

* update changelog

* update on comments

* forgot status

* Fix train_bn arg

* resolve comments

* update on comments

Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-02-17 16:40:13 +00:00
chaton 141316fb29
[BugFix] Resolve bugs in computer_vision_fine_tuning.py example (#5985)
* update the script to use DataModule

* add message at for the frozen parameters

* add message about trainable parameters

* resolve flake8
2021-02-16 21:01:04 +00:00
Justus Schock da6dbc8d1d
PoC: Accelerator refactor (#5743)
* restoring the result from subprocess

* fix queue.get() order for results

* add missing "block_backward_sync" context manager

* add missing "block_backward_sync" context manager

* fix sync_batchnorm

* fix supported gpu-ids for tuple

* fix clip gradients and inf recursion

* accelerator selection: added cluster_environment plugin

* fix torchelastic test

* fix reduce early stopping decision for DDP

* fix tests: callbacks, conversion to lightning optimizer

* fix lightning optimizer does not pickle

* fix setting benchmark and deterministic option

* fix slurm amp test

* fix prepare_data test and determine node_rank

* fix retrieving last path when testing

* remove obsolete plugin argument

* fix test: test_trainer_config

* fix torchscript tests

* fix trainer.model access

* move properties

* fix test_transfer_batch_hook

* fix auto_select_gpus

* fix omegaconf test

* fix test that needs to simulate slurm ddp

* add horovod plugin

* fix test with named arguments

* clean up whitespace

* fix datamodules test

* remove old accelerators

* fix naming

* move old plugins

* move to plugins

* create precision subpackage

* create training_type subpackage

* fix all new import errors

* fix wrong arguments order passed to test

* fix LR finder

* Added sharded training type and amp plugin

* Move clip grad to precision plugin

* Added sharded spawn, select accelerators based on distributed_backend + enable custom fp16 plugin automatically

* Fix import issue, attempting to fix tests

* Fix initial test

* Reflect hook logic from master, should wrap model after move to device

* Optional state consolidation, since master has optimizers not wrapped

* change attribute for instance test

* reset optimizers

optimizers are not used in main process, so state would be wrong.

* legacy

* imports in accel

* legacy2

* trainer imports

* fix import errors after rebase

* move hook to new setup location

* provide unwrapping logic

* fix trainer callback system

* added ddp2 implementation

* fix imports .legacy

* move plugins

* restore legacy

* drop test.py from root

* add tpu accelerator and plugins

* fixes

* fix lightning optimizer merge

* reset bugreportmodel

* unwrapping

* step routing forward

* model access

* unwrap

* opt

* integrate distrib_type

* sync changes

* sync

* fixes

* add forgotten generators

* add missing logic

* update

* import

* missed imports

* import fixes

* isort

* mv f

* changelog

* format

* move helper to parallel plugin

* d

* add world size

* clean up

* duplicate

* activate ddp_sharded and tpu

* set nvidia flags

* remove unused colab var

* use_tpu <-> on_tpu attrs

* make some ddp_cpu and clusterplugin tests pass

* Ref/accelerator connector (#5742)

* final cleanup

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* connector cleanup

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* trainer cleanup

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* accelerator cleanup + missing logic in accelerator connector

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* add missing changes to callbacks

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* reflect accelerator changes to lightning module

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* clean cluster envs

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* cleanup plugins

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* add broadcasting

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* yapf

* remove plugin connector

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* plugins

* manual optimization

* update optimizer routing

* add rank to torchelastic

* fix memory mixed precision

* setstate on trainer for pickling in ddp spawn

* add predict method

* add back commented accelerator code

* adapt test for sync_batch_norm to new plugin

* fix deprecated tests

* fix ddp cpu choice when no num_processes are given

* yapf format

* skip a memory test that cannot pass anymore

* fix pickle error in spawn plugin

* x

* avoid

* x

* fix cyclic import in docs build

* add support for sharded

* update typing

* add sharded and sharded_spawn to distributed types

* make unwrap model default

* refactor LightningShardedDataParallel similar to LightningDistributedDataParallel

* update sharded spawn to reflect changes

* update sharded to reflect changes

* Merge 1.1.5 changes

* fix merge

* fix merge

* yapf isort

* fix merge

* yapf isort

* fix indentation in test

* copy over reinit scheduler implementation from dev1.2

* fix apex tracking calls with dev_debugger

* reduce diff to dev1.2, clean up

* fix trainer config test  when gpus>0 and num_processes >0 and ddp_cpu

* sort plugin tests legacy/new

* fix error handling for amp on cpu

* fix merge


fix merge


fix merge

* [Feat] Resolve manual_backward (#5837)

* resolve manual_backward

* resolve flake8

* update

* resolve for ddp_spawn

* resolve flake8

* resolve flake8

* resolve flake8

Co-authored-by: Ubuntu <ubuntu@ip-172-31-88-60.ec2.internal>

* fix tests/accelerator tests on cpu

* [BugFix] Resolve manual optimization (#5852)

* resolve manual_optimization

* update

* update

Co-authored-by: Ubuntu <ubuntu@ip-172-31-88-60.ec2.internal>

* Remove copy trainer parameters to happen earlier within the loop and add safe guard to get ref model (#5856)

* resolve a bug

* Accelerator refactor sharded rpc (#5854)

* rpc branch

* merge

* update handling of rpc

* make devices etc. Optional in RPC

* set devices etc. later if necessary

* remove devices from sequential

* make devices optional in rpc

* fix import

* uncomment everything

* fix cluster selection

Co-authored-by: Ubuntu <ubuntu@ip-172-31-88-60.ec2.internal>

* resolve bug

* fix assert in rpc test

* resolve a test

* fix docs compilation

* accelerator refactor - fix for sharded parity test (#5866)

* fix memory issue with ddp_spawn

* x


x


x


x


x


x


x


x


x

* x

* Remove DDP2 as this does not apply

* Add missing pre optimizer hook to ensure lambda closure is called

* fix apex docstring

* [accelerator][BugFix] Resolve some test for 1 gpu (#5863)

* update

* revert init

* resolve a bug

* update

* resolve flake8

* update

* update

* update

* revert init

* resolve a bug

* update

* resolve flake8

* update

* update

* update

* update

* update

* revert init

* resolve a bug

* update

* resolve flake8

* update

* update

* update

* revert init

* update

* resolve flake8

* update

* update

* update

* update

* update

* all_gather

* update

* make plugins work, add misconfig for RPC

* update

* update

* remove breaking test

* resolve some tests

* resolve flake8

* revert to ddp_spawn

Co-authored-by: root <root@ip-172-31-88-60.ec2.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-88-60.ec2.internal>
Co-authored-by: Justus Schock <justus.schock@rwth-aachen.de>

* yapf isort

* resolve flake8

* fix apex doctests

* fix apex doctests 2

* resolve docs

* update drone

* clean env

* update

* update

* update

* update

* merge

* Fix RPC related tests, clean out old API, update for new accelerator API [skip ci] (#5881)

* Fix RPC related tests, clean out old API, update for new accelerator API

* Move tests out of legacy folder, update paths and names

* Update test_remove_1-4.py

* Expose properties for tpu cores/gpus/num_gpus

* Add root GPU property

* Move properties to properties.py

* move tests that were previously in drone

* Fix root GPU property (#5908)

* Move root GPU to property, remove horovod set as this is handled in horovod plugin, ensure we mock correctly to set GPU accelerator

* Add missing tests back

* fix best model path transfer when no checkpoint callback available

* Fix setup hook order [wip] (#5858)

* Call trainer setup hook before accelerator setup

* Add test case

* add new test

* typo

* fix callback order in test

Co-authored-by: tchaton <thomas@grid.ai>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* rename ddp sequential -> rpc sequential for special test

* revert

* fix stupid merge problem

* Use property in connector for sampler (#5913)

* merge the import conflicts

* fix spawning of processes in slurm

* [wip] Fix some bugs for TPU [skip ci] (#5878)

* fixed for single tpu

* fixed spawn

* fixed spawn

* update

* update

* wip

* resolve bugs

* resolve bug

* update on comment

* removed decorator

* resolve comments

* set to 4

* update

* update

* need cleaning

* update

* update

* update

* resolve flake8

* resolve bugs

* exclude broadcast

* resolve bugs

* change test

* update

* update

* skip if meet fails

* properly raise trace

* update

* add catch

* wrap test

* resolve typo

* update

* typo

Co-authored-by: Lezwon Castelino <lezwon@gmail.com>
Co-authored-by: Your Name <you@example.com>

* resolve some tests

* update

* fix imports

* update

* resolve flake8

* update azure pipeline

* skip a sharded test on cpu that requires a gpu

* resolve tpus

* resolve bug

* resolve flake8

* update

* update utils

* revert permission change on files

* suggestions from carlos

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* remove unrelated formatting changes

* remove incomplete comment

* Update pytorch_lightning/accelerators/__init__.py

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* remove unrelated formatting change

* add types

* warn 1.7 ddp manual backward only if ddp kwarg unset

* yapf + isort

* pep8 unused imports

* fix cyclic import in docs

* Apply suggestions from code review

* typer in accelerator.py

* typo

* Apply suggestions from code review

* formatting

* update on comments

* update typo

* Update pytorch_lightning/trainer/properties.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* update

* suggestion from code review

* suggestion from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: SeanNaren <sean@grid.ai>
Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-88-60.ec2.internal>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: root <root@ip-172-31-88-60.ec2.internal>
Co-authored-by: Lezwon Castelino <lezwon@gmail.com>
Co-authored-by: Your Name <you@example.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2021-02-12 15:48:56 -05:00
Rohit Gupta cb67e1d0b2
Separate epoch validation from step validation (#5208)
* Separate epoch validation from step validation

* update system

* test

* baked logic in callbacks

* unbake logic in callbacks

* fix the call for scheduler

* use property

* pep

* correct rebase

* gitignore

* ref

* add tests

* fix

* add early stopping test

* trigger

* chlog

* rev

* 1.3

* log

* Apply suggestions from code review

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Update pytorch_lightning/trainer/training_loop.py

* Update CHANGELOG.md

* Apply suggestions from code review

Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

(cherry picked from commit e429f97b67)
2021-02-08 20:22:39 +01:00
chaton e425bf3ba9
[BugOnFeat] Resolve bug with Finetuning (#5744)
* resolve bug + add doc

* Update pytorch_lightning/callbacks/finetuning.py

* resolve bug

* start adding more test

* add more tests for finetuning callback functions

* rename to flatten_modules

* resolve doc

* Update pytorch_lightning/callbacks/finetuning.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* resolve comments

* remove update on BoringModel

* update on comments

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-02-04 18:36:54 +00:00
Jirka Borovec dee5553b2b
move to Pages dir (#4869)
* folders

* common / advanced / extensions

* paths

* flake8

* isort

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2021-01-26 15:07:07 -05:00
Jirka Borovec 9dd04028d5
tests for legacy checkpoints (#5223)
* wip

* generate

* clean

* tests

* copy

* download

* download

* download

* download

* download

* download

* download

* download

* download

* download

* download

* flake8

* extend

* aws

* extension

* pull

* pull

* pull

* pull

* pull

* pull

* pull

* try

* try

* try

* got it

* Apply suggestions from code review

(cherry picked from commit 72525f0a83)
2021-01-26 14:27:56 +01:00