Commit Graph

80 Commits

Author SHA1 Message Date
Adrian Wälchli c912c4b729
remove legacy accelerators (#5949)
* remove legacy accelerators

* update imports

* formatting

Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>
2021-02-14 16:03:45 +00:00
Justus Schock da6dbc8d1d
PoC: Accelerator refactor (#5743)
* restoring the result from subprocess

* fix queue.get() order for results

* add missing "block_backward_sync" context manager

* add missing "block_backward_sync" context manager

* fix sync_batchnorm

* fix supported gpu-ids for tuple

* fix clip gradients and inf recursion

* accelerator selection: added cluster_environment plugin

* fix torchelastic test

* fix reduce early stopping decision for DDP

* fix tests: callbacks, conversion to lightning optimizer

* fix lightning optimizer does not pickle

* fix setting benchmark and deterministic option

* fix slurm amp test

* fix prepare_data test and determine node_rank

* fix retrieving last path when testing

* remove obsolete plugin argument

* fix test: test_trainer_config

* fix torchscript tests

* fix trainer.model access

* move properties

* fix test_transfer_batch_hook

* fix auto_select_gpus

* fix omegaconf test

* fix test that needs to simulate slurm ddp

* add horovod plugin

* fix test with named arguments

* clean up whitespace

* fix datamodules test

* remove old accelerators

* fix naming

* move old plugins

* move to plugins

* create precision subpackage

* create training_type subpackage

* fix all new import errors

* fix wrong arguments order passed to test

* fix LR finder

* Added sharded training type and amp plugin

* Move clip grad to precision plugin

* Added sharded spawn, select accelerators based on distributed_backend + enable custom fp16 plugin automatically

* Fix import issue, attempting to fix tests

* Fix initial test

* Reflect hook logic from master, should wrap model after move to device

* Optional state consolidation, since master has optimizers not wrapped

* change attribute for instance test

* reset optimizers

optimizers are not used in main process, so state would be wrong.

* legacy

* imports in accel

* legacy2

* trainer imports

* fix import errors after rebase

* move hook to new setup location

* provide unwrapping logic

* fix trainer callback system

* added ddp2 implementation

* fix imports .legacy

* move plugins

* restore legacy

* drop test.py from root

* add tpu accelerator and plugins

* fixes

* fix lightning optimizer merge

* reset bugreportmodel

* unwrapping

* step routing forward

* model access

* unwrap

* opt

* integrate distrib_type

* sync changes

* sync

* fixes

* add forgotten generators

* add missing logic

* update

* import

* missed imports

* import fixes

* isort

* mv f

* changelog

* format

* move helper to parallel plugin

* d

* add world size

* clean up

* duplicate

* activate ddp_sharded and tpu

* set nvidia flags

* remove unused colab var

* use_tpu <-> on_tpu attrs

* make some ddp_cpu and clusterplugin tests pass

* Ref/accelerator connector (#5742)

* final cleanup

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* connector cleanup

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* trainer cleanup

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* accelerator cleanup + missing logic in accelerator connector

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* add missing changes to callbacks

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* reflect accelerator changes to lightning module

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* clean cluster envs

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* cleanup plugins

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* add broadcasting

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* yapf

* remove plugin connector

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* plugins

* manual optimization

* update optimizer routing

* add rank to torchelastic

* fix memory mixed precision

* setstate on trainer for pickling in ddp spawn

* add predict method

* add back commented accelerator code

* adapt test for sync_batch_norm to new plugin

* fix deprecated tests

* fix ddp cpu choice when no num_processes are given

* yapf format

* skip a memory test that cannot pass anymore

* fix pickle error in spawn plugin

* x

* avoid

* x

* fix cyclic import in docs build

* add support for sharded

* update typing

* add sharded and sharded_spawn to distributed types

* make unwrap model default

* refactor LightningShardedDataParallel similar to LightningDistributedDataParallel

* update sharded spawn to reflect changes

* update sharded to reflect changes

* Merge 1.1.5 changes

* fix merge

* fix merge

* yapf isort

* fix merge

* yapf isort

* fix indentation in test

* copy over reinit scheduler implementation from dev1.2

* fix apex tracking calls with dev_debugger

* reduce diff to dev1.2, clean up

* fix trainer config test when gpus>0 and num_processes>0 and ddp_cpu

* sort plugin tests legacy/new

* fix error handling for amp on cpu

* fix merge

* fix merge

* fix merge

* [Feat] Resolve manual_backward (#5837)

* resolve manual_backward

* resolve flake8

* update

* resolve for ddp_spawn

* resolve flake8

* resolve flake8

* resolve flake8

Co-authored-by: Ubuntu <ubuntu@ip-172-31-88-60.ec2.internal>

* fix tests/accelerator tests on cpu

* [BugFix] Resolve manual optimization (#5852)

* resolve manual_optimization

* update

* update

Co-authored-by: Ubuntu <ubuntu@ip-172-31-88-60.ec2.internal>

* Remove copy trainer parameters to happen earlier within the loop and add safe guard to get ref model (#5856)

* resolve a bug

* Accelerator refactor sharded rpc (#5854)

* rpc branch

* merge

* update handling of rpc

* make devices etc. Optional in RPC

* set devices etc. later if necessary

* remove devices from sequential

* make devices optional in rpc

* fix import

* uncomment everything

* fix cluster selection

Co-authored-by: Ubuntu <ubuntu@ip-172-31-88-60.ec2.internal>

* resolve bug

* fix assert in rpc test

* resolve a test

* fix docs compilation

* accelerator refactor - fix for sharded parity test (#5866)

* fix memory issue with ddp_spawn

* x

* x

* x

* x

* x

* x

* x

* x

* x

* x

* Remove DDP2 as this does not apply

* Add missing pre optimizer hook to ensure lambda closure is called

* fix apex docstring

* [accelerator][BugFix] Resolve some test for 1 gpu (#5863)

* update

* revert init

* resolve a bug

* update

* resolve flake8

* update

* update

* update

* revert init

* resolve a bug

* update

* resolve flake8

* update

* update

* update

* update

* update

* revert init

* resolve a bug

* update

* resolve flake8

* update

* update

* update

* revert init

* update

* resolve flake8

* update

* update

* update

* update

* update

* all_gather

* update

* make plugins work, add misconfig for RPC

* update

* update

* remove breaking test

* resolve some tests

* resolve flake8

* revert to ddp_spawn

Co-authored-by: root <root@ip-172-31-88-60.ec2.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-88-60.ec2.internal>
Co-authored-by: Justus Schock <justus.schock@rwth-aachen.de>

* yapf isort

* resolve flake8

* fix apex doctests

* fix apex doctests 2

* resolve docs

* update drone

* clean env

* update

* update

* update

* update

* merge

* Fix RPC related tests, clean out old API, update for new accelerator API [skip ci] (#5881)

* Fix RPC related tests, clean out old API, update for new accelerator API

* Move tests out of legacy folder, update paths and names

* Update test_remove_1-4.py

* Expose properties for tpu cores/gpus/num_gpus

* Add root GPU property

* Move properties to properties.py

* move tests that were previously in drone

* Fix root GPU property (#5908)

* Move root GPU to property, remove horovod set as this is handled in horovod plugin, ensure we mock correctly to set GPU accelerator

* Add missing tests back

* fix best model path transfer when no checkpoint callback available

* Fix setup hook order [wip] (#5858)

* Call trainer setup hook before accelerator setup

* Add test case

* add new test

* typo

* fix callback order in test

Co-authored-by: tchaton <thomas@grid.ai>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* rename ddp sequential -> rpc sequential for special test

* revert

* fix stupid merge problem

* Use property in connector for sampler (#5913)

* merge the import conflicts

* fix spawning of processes in slurm

* [wip] Fix some bugs for TPU [skip ci] (#5878)

* fixed for single tpu

* fixed spawn

* fixed spawn

* update

* update

* wip

* resolve bugs

* resolve bug

* update on comment

* removed decorator

* resolve comments

* set to 4

* update

* update

* need cleaning

* update

* update

* update

* resolve flake8

* resolve bugs

* exclude broadcast

* resolve bugs

* change test

* update

* update

* skip if meet fails

* properly raise trace

* update

* add catch

* wrap test

* resolve typo

* update

* typo

Co-authored-by: Lezwon Castelino <lezwon@gmail.com>
Co-authored-by: Your Name <you@example.com>

* resolve some tests

* update

* fix imports

* update

* resolve flake8

* update azure pipeline

* skip a sharded test on cpu that requires a gpu

* resolve tpus

* resolve bug

* resolve flake8

* update

* update utils

* revert permission change on files

* suggestions from carlos

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* remove unrelated formatting changes

* remove incomplete comment

* Update pytorch_lightning/accelerators/__init__.py

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* remove unrelated formatting change

* add types

* warn 1.7 ddp manual backward only if ddp kwarg unset

* yapf + isort

* pep8 unused imports

* fix cyclic import in docs

* Apply suggestions from code review

* typer in accelerator.py

* typo

* Apply suggestions from code review

* formatting

* update on comments

* update typo

* Update pytorch_lightning/trainer/properties.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* update

* suggestion from code review

* suggestion from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: SeanNaren <sean@grid.ai>
Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-88-60.ec2.internal>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: root <root@ip-172-31-88-60.ec2.internal>
Co-authored-by: Lezwon Castelino <lezwon@gmail.com>
Co-authored-by: Your Name <you@example.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2021-02-12 15:48:56 -05:00
Kaushik B 4857546c25
Fix: Failing test in data_modules(dp) (#5924)
* Update test_datamodules.py

* fix code format issue

* fix test restore

* fix code format issue
2021-02-11 17:32:46 +00:00
Teddy Koker 253e57c2c2
Feature: LightningDataModule.from_datasets(...) (#5133)
* add class method

* add tests

* docstring

* pep

* Add type annotations

Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>

* pep

* fix import

* remove num_workers inference

* Update pytorch_lightning/core/datamodule.py

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Update pytorch_lightning/core/datamodule.py

Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>

* Update pytorch_lightning/core/datamodule.py

Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>

* fix syntax

* typing fix

* list -> sequence

* list -> sequence

* missing import

* fix test

Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-02-11 14:32:41 +00:00
Rohit Gupta 8e9a026bc3
[tests/models] refactor with BoringModel (#5507)
* update with BoringModel

* update with BoringModel

* step

* try TPU

* TPU

* update tests

* update tpu tests

* self

* fix

* dp

* update tests

* ref

* update tests

* fix tpu tests

* fix dp and run_prediction

* dp

* only dp

* Apply suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-02-11 14:32:07 +00:00
Jirka Borovec b434c479e7
Quantisation (#5706)
* empty

* sq

* obs


* int

* ts

* helpers

* chlog

* yapf

* avg

* dupl

* Apply suggestions from code review

Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>

* Apply suggestions from code review

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* fixes

* Apply suggestions from code review

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* fixes

* note

* warn

* 45

* link

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Apply suggestions from code review

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* yapf

* flake8

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-02-11 07:04:57 -05:00
Jirka Borovec a0f7831278
fix misleading imports in tests (#5873)
* fix imports

* .
2021-02-09 05:10:52 -05:00
Jirka Borovec bd920b4102
Refactor simplify tests (#5861)
* add new

* restructure

* yapf

* move

* fix
2021-02-08 11:52:02 +01:00
Jirka Borovec 4faaef7758
formatting tests: 4/n (#5846)
* models

* ckpt

* core

* log
2021-02-06 12:07:26 +01:00
ananthsub 06f65938ef
Fix toggle optimizer (#5775)
* Update lightning.py

* update changelog

* add a 3 optimizer test

* resolve flake8

* remove extra code

* typo

* resolve typo

* Apply suggestions from code review

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

Co-authored-by: tchaton <thomas@grid.ai>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2021-02-05 21:43:10 +01:00
Rohit Gupta 7a50c33f97
update tests with new auto_opt api (#5466)
* update tests with new auto_opt api

* Apply suggestions from code review

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-02-03 19:39:28 +01:00
Adrian Wälchli 8943d8bca0
add missing logic to new plugins and accelerator (#5734)
* add missing logic

* missed imports

* import fixes

* isort

* mv f

* changelog

* format

* move helper to parallel plugin

* d
2021-02-01 13:23:53 -05:00
Adrian Wälchli 344f3a984a
Refactor access to trainer attributes in LightningModule (#5730)
* rank access

* tests for property

* weakref

* logger

* changelog

* torchscript

* changelog

* chlog

* .

* amp

* yapf

* flake8

Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>
2021-02-01 14:28:17 +00:00
chaton d0aaf983b9
[Feat] Adding PruningCallback (#5618)
* wip

* add pruning callback

* add condition for duplicated weights

* update on comments

* update on comments

* update on comments

* add more tests

* resolve flake8

* resolve on comments

* update changelog

* update on comments

* update on comments

* change order

* remove ddp_spawn skip

* update

* typo

* Update pytorch_lightning/callbacks/pruning.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/callbacks/pruning.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* update on comments

* forgot platform

* update on comments

* remove @rank_zero_only

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2021-01-27 01:00:42 -05:00
Jirka Borovec 7e2e874d95
Refactor: legacy accelerators and plugins (#5645)
* tests: legacy

* legacy: accel

* legacy: plug

* fix imports

* mypy

* flake8
2021-01-26 20:04:36 -05:00
chaton 0435e23a64
deprecate enable_pl_optimizer as it is not restored properly (#5244)
* update

* clean test

* still in progress

* udpdate test

* update

* update

* resolve flake

* add test for zero_grad

* update

* works without accumulated_grad

* update

* update

* resolve amp

* revert back to True

* update

* clean tests

* cleaned out

* typo

* update test

* git repair bug

* remove print

* update

* Fix formatting/optimizer imports

* Refactor the test for cleanliness

* Add vanilla model to the test, better var names

* Fixed var names, let's clean up these mock tests

* repair test

* update test

* resolve flake8

* add manual_optimization

* update tests

* resolve flake8

* add random accumulate_grad_batches

* improve test

* Update tests/trainer/optimization/test_parity_automatic_optimization.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/trainer/optimization/test_parity_automatic_optimization.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* update

* clean tests

* correct bug

* Apply suggestions from code review

* format

* address comments

* update on comments

* wip

* typo

* deprecate enable_pl_optimizer

* resolve latest bugs

* update

* resolve merge

* add comment

* Update pytorch_lightning/core/lightning.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/deprecated_api/test_remove_1-3.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/trainer/connectors/optimizer_connector.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/trainer/trainer.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/trainer/trainer.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/trainer/optimization/test_parity_automatic_optimization.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* update on comments

* update restore

* add a property

* remove setstate as not needed anymore

* update test

* provide optimizer to on_before_zero_grad

* update on comments

* update on comments

* Update pytorch_lightning/trainer/trainer.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Update tests/trainer/optimization/test_parity_automatic_optimization.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Update tests/trainer/optimization/test_parity_automatic_optimization.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Update tests/trainer/optimization/test_parity_automatic_optimization.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* modify import

* update changelog

* resolve flake8

* update

* update

* clean doc

Co-authored-by: SeanNaren <sean@grid.ai>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-62-109.ec2.internal>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>

(cherry picked from commit f2e99d617f)
2021-01-26 14:29:46 +01:00
chaton 5f3372871a
[feat] Add PyTorch Profiler. (#5560)
* add profiler

* add profiler

* update

* resolve flake8

* update doc

* update changelog

* clean doc

* delete prof file

* merge pr codebase

* update

* update doc

* update doc

* update doc

* update on comments

* update docstring

* update docstring

* try

* update test

* Update pytorch_lightning/profiler/__init__.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/profiler/__init__.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* update on comments

* remove old code

* add support for ddp

* resolve flake8

* Update pytorch_lightning/profiler/__init__.py

Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>

* resolve tests

* resolve flake8

Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2021-01-26 06:48:54 -05:00
Jirka Borovec c3587d39da
prune deprecated EvalResult (#5633)
* prune EvalResult

* drop tests

* drop usage

* drop class

* prune
2021-01-26 03:09:39 -05:00
Jirka Borovec 7b30133a82
flake8 & isort (#5647)
2021-01-25 14:31:38 -05:00
NeuralLink db784225eb
summarize total size of model params in bytes (#5590)
* simplified model size calc

* fix spaces

* fix newlines

* minor refactor

* Update pytorch_lightning/core/memory.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* make model size property

* fix doctest

* Update pytorch_lightning/core/memory.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* remove explicit doctest from file

* better docs

* model precalculate size 1.0 mbs

* better comment

* Update tests/core/test_memory.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* Update tests/core/test_memory.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* merge _model_size into model_size property itself

* minor comment fix

* add feature to changelog

* added precision test

* isort

* minor def name typo

* remove monkeypatch set env as boringmodel won't need any torch hub cache

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2021-01-25 09:35:29 +01:00
Rohit Gupta 29bcf30984
[tests/core] Updated with BoringModel and added BoringDataModule (#5432)
* update with BoringModel and introduce BoringDataModule

* isort

* fix

* rm random_split

* fix test

* fix test

* update

* update test_results

* val_step

* update tests

* rebase

* rebase
2021-01-13 01:48:37 -05:00
Jirka Borovec 059f4630c8
prune check on Trainer fit result (#5453)
* prune check on Trainer fit result

* flake8

* Apply suggestions from code review

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* .

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-01-11 19:36:48 -05:00
Rohit Gupta 704e00ee7f
Fix invalid value for weights_summary (#5296)
* Fix weights_summary

* use mode

* fix

* optional

* what was I thinking

(cherry picked from commit 062800aa99)
2021-01-06 12:59:32 +01:00
Jirka Borovec af833f673c
drop deprecated TrainResult (#5323)
* drop TrainResult

* .

* .

* .

* .

* .

* .
2021-01-04 09:54:21 +08:00
Jirka Borovec a884866ff0
Unify names in Utils (#5199)
* warnings

* argparse

* mutils

* xla device

* deprecated

* tests

* simple

* flake8

* fix

* flake8

* 1.4
2020-12-22 00:23:33 +01:00
Jirka Borovec 0f36525e8f
fix/enable - check F401 (#5201)
* refactor - check F401

* missed

* fix
2020-12-21 10:15:04 +01:00
Jirka Borovec 35fd6e93c7
refactor - check E501 (#5200)
2020-12-21 14:23:09 +05:30
Jirka Borovec 6d2c564bc6
refactor - check F841 (#5202)
2020-12-21 11:10:55 +05:30
Jirka Borovec a49291d98d
drop unused test with result api (#5058)
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-12-12 21:51:19 +05:30
chaton 1a970b2d8d
[hotfix] Extend Optimizer + update doc (#5095)
* resolve urgent bug

* update pr

* update doc

* update

* remove typo

* add defaults

* Update pytorch_lightning/__init__.py

* Update setup.py

* update doc

* Update docs/source/optimizers.rst

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* update

* resolve doc

* debug test

* update test

* Update docs/source/optimizers.rst

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Update docs/source/optimizers.rst

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Update docs/source/optimizers.rst

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* remove useless import

* Update docs/source/optimizers.rst

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-12-11 14:24:59 -05:00
chaton 7755572b4f
Check if optimizer supports closure (#4981)
* check if optimizer support closure

* cleanup test

* resolve tests

* resolve flake

* update test due to patch limit

* update

* update dep

* Update tests/core/test_lightning_optimizer.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* Update tests/core/test_lightning_optimizer.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* resolve bug

* update test

* resolve tests

* Update requirements/extra.txt

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* remove bolts dep

* remove bolts

* add missing bolts dep for tests

* remove need for bolts

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-12-11 14:51:45 +01:00
Jirka Borovec 4ebce38478
update usage of deprecated automatic_optimization (#5011)
* drop deprecated usage automatic_optimization

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Apply suggestions from code review

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-12-10 15:31:33 +05:30
Jirka Borovec 05f25f3a54
update usage of deprecated checkpoint_callback (#5006)
* drop usage of deprecated checkpoint_callback

* fix

* fix
2020-12-09 14:14:34 -05:00
Jirka Borovec 53d7c9555c
drop usage of deprecated distributed_backend (#5009)
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Roger Shieh <sh.rog@protonmail.ch>
2020-12-09 09:18:23 +01:00
chaton 02152c1729
Simplify optimization Logic (#4984)
* Rely on ddp plugin for blocking sync behaviour, and skip if we're using manual optimization

* debug

* Revert "debug"

This reverts commit ccca6b6b

* Expose manual reduce for automatic optimization

* Add input arguments

* Enable parity test

* clean imports

* Expose hook after to ensure we reset

* Fix naming

* add

* fix test

* uniformize optimizer logic

* resolve test

* resolve flake8

* resolve amp bug

* update tests

* remove bug

* remove optimizer_step in accelerators

* typo

* update lightning optimizer

* set doesn't work with ddp_spawn

* resolve flake8

* update threshold

* ignore pyright

* correct codeFactor

* remove useless if

* remove zer_grad function

* simplify step

* remove typo

* resolve bug

* Apply suggestions from code review

* update on comments

* resolve bugs

* remove tests

* Update pytorch_lightning/trainer/configuration_validator.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* simplify testing

* add more tests

Co-authored-by: SeanNaren <sean@grid.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-12-07 12:55:49 +00:00
chaton c2e6e68c7e
optimizer clean up (#4658)
* add LightningOptimizer

* typo

* add mock closure

* typo

* remove logic in optimizer_step

* update

* update

* update

* deactivate LightningOptimizer for horovod

* resolve flake

* typo

* check optimizer name

* change name

* added backward to LightningOptimizer

* remove use_lightning_optimizer

* move update

* simplify init

* resolve comments

* resolve bug

* update

* update

* resolve bugs

* resolve flake8

* set state

* work manual_optimizer_step

* add doc

* add enable_pl_optimizer

* make optimizer_step

* add make_optimizer_step

* add examples

* resolve test

* add test_optimizer_return_options_enable_pl_optimizer

* add enable_pl_optimizer=True

* update

* update tests

* resolve bugs

* update

* set Trainer to False

* update

* resolve bugs

* update

* remove from doc

* resolve bug

* typo

* update

* set to True

* simplification

* typo

* resolve horovod

* unwrap horovod

* remove Optimizer

* resolve horovod

* move logic to amp_backend

* doesn't seem to be picklable

* update

* add again

* resolve some bugs

* cleanup

* resolve bug with AMP

* change __repr__

* round at -12

* update

* update

* update

* remove from horovod

* typo

* add convert_to_lightning_optimizers in each accelerators

* typo

* forgot

* forgot a convert_to_lightning_optimizers

* update

* update

* update

* increase coverage

* update

* resolve flake8

* update

* remove useless code

* resolve comments + add support for LightningOptimizer base class

* resolve flake

* check optimizer get wrapped back

* resolve DDPSharded

* reduce code

* lightningoptimizer

* Update pytorch_lightning/core/optimizer.py

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Update pytorch_lightning/core/lightning.py

* remove reference to step function

* Apply suggestions from code review

* update on comments

* resolve

* Update CHANGELOG.md

* add back training_step in apex and native_amp

* rename optimizer_step

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-12-01 00:09:46 +00:00
Rohit Gupta 4c7ebdc32b
Add dirpath and filename parameter in ModelCheckpoint (#4213)
* Add dirpath and filename parameter in ModelCheckpoint

* remove old function

* chlog

* codefactor

* update tests

* docs

* fix doctest and added tests

* pathlib dirpath

* dep version and docs

* try fix doctest

* pep

* suggestions

Co-authored-by: carmocca <carlossmocholi@gmail.com>

* suggestions

* fix test

* pep

* trigger tests

* Apply suggestions from code review

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* suggestions

* try fix windows test

* add and update some tests

* trigger tests

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-10-23 09:59:12 +05:30
William Falcon 45d05ff68d
Fixes #4141 (#4169)
* fix val epoch agg

* fix val agg metrics

* fix val agg metrics

* fix val agg metrics
2020-10-15 09:12:05 -04:00
William Falcon 09c2020a93
notices (#4118)
2020-10-13 07:18:07 -04:00
William Falcon 7ffe05a3d1
ref: accelerator names (#4066)
* ref: accelerator names

* docs
2020-10-11 01:05:14 -04:00
Ananya Harsh Jha ae8772490d
classification metrics (#4043)
* docs + precision + recall + f_beta + refactor

Co-authored-by: Teddy Koker <teddy.koker@gmail.com>

* rebase

Co-authored-by: Teddy Koker <teddy.koker@gmail.com>

* fixes

Co-authored-by: Teddy Koker <teddy.koker@gmail.com>

* added missing file

* docs

* docs

* extra import

Co-authored-by: Teddy Koker <teddy.koker@gmail.com>
2020-10-10 12:31:00 -04:00
Nrupatunga fcfa587492
Bugfix/update trainer properties (#3975)
* make current_epoch and global_step to be same as trainer, after model restore.

* remove assignment here

* test

* minor modification

* merge with parent's master

* [bug-fix]: update trainer properties

* minor comment fix

* minor comment fix

* reset train loader in `on_train_epoch_start` hook

* makes sure the changes work

* minor change

* update changelog

* adding unit test for reload_dataloaders_every_epoch arg

* modified changelog, to add PR number

* revert imports

* changes to unit test

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-10-08 10:20:55 -04:00
Ananya Harsh Jha 6f1a2ce517
integrate metrics API with self.log (#3961)
* metrics integration into self.log

Co-authored-by: Teddy Koker <teddy.koker@gmail.com>

* ddp and regular test for self.log + metrics

Co-authored-by: Teddy Koker <teddy.koker@gmail.com>

* pep8

* fix log tests

Co-authored-by: Teddy Koker <teddy.koker@gmail.com>

* docs

Co-authored-by: Teddy Koker <teddy.koker@gmail.com>

Co-authored-by: Teddy Koker <teddy.koker@gmail.com>
2020-10-07 22:54:32 -04:00
William Falcon 838940eee7
removing this troubling test that has random behavior (#3941)
* threshold

* threshold
2020-10-07 12:01:51 -04:00
William Falcon 71a4c61f6e
fixes #3871 (#3919)
* fixes #3871

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* moves sync bn to each backend

* moves sync bn to each backend

Co-authored-by: nateraw <nxr9266@g.rit.edu>
2020-10-06 22:56:34 -04:00
Nathan Raw 1954d7c87a
Write predictions in LightningModule instead of EvalResult (#3882)
* add self.write_prediction

* add self.write_prediction_dict to lightning module
2020-10-05 18:04:02 -04:00
William Falcon 0fb8c54fda
remove deprecated test (#3820)
2020-10-03 13:21:10 -04:00
William Falcon d9bc95f83e
ref: bug fix with logging val epoch end + monitor (#3812)
* ref: fix metric err

* ref: fix metric err

* ref: fix metric err

* ref: merge

* ref: merge

* ref: merge

* ref: merge

* ref: decoupled ddp2

* ref: decoupled ddp2

* ref: decoupled ddp2

* ref: decoupled ddp2

* ref: decoupled ddp2

* ref: clean up ddp before final fix

* ref: clean up ddp before final fix

* ref: clean up ddp before final fix

* ref: clean up ddp before final fix

* ref: clean up ddp before final fix

* ref: clean up ddp before final fix

* ref: clean up ddp before final fix

* ref: clean up ddp before final fix

* ref: clean up ddp before final fix

* ref: clean up ddp before final fix

* ref: clean up ddp before final fix
2020-10-03 12:33:29 -04:00
William Falcon a38d108a68
add dist lib to enable syncing anything across devices (#3762)
* add dist lib to enable syncing anything across devices
2020-10-01 01:21:38 -04:00
ananthsub 3dcf7130c5
Support checkpoint hooks on data module (#3563)
* Split out changes from #3563 to make that PR easier to review. This formats the file according to the Black formatter

* Store a reference to the trainer on the datamodule

Fixes #3682

* Update data_connector.py

* Update data_connector.py

* Update test_datamodules.py

* Split out changes from #3563 to make that PR easier to review. This formats the file according to the Black formatter

* support checkpoint hooks for datamodule

refactor on_{save/load}_checkpoint to a separate hook class that both the lightning module and data module inherit
add spots in callback connector to call new datamodule hooks if available

* hooks formatting

* Update hooks.py

* Update checkpoint_connector.py

* Update lightning.py

* update based on upstream/master

checkout upstream/master

* Update checkpoint_connector.py

* add tests

* undo format revert

* Updated CHANGELOG.md

* add checkpoint hooks

* add Dict type

* import CheckpointHooks
2020-09-29 19:51:44 +02:00