Commit Graph

230 Commits

Author SHA1 Message Date
Jirka Borovec c2c82dad62
CI: Azure (#5882)
* add base Azure pipeline

* skip
2021-02-10 04:43:26 -05:00
Jirka Borovec 9dd56398e3
fixing some compatibility with PT 1.8 (#5864)
* change default

* .

* p

* 0.21.2

* .

* fix

* .
2021-02-09 18:25:57 +01:00
Jirka Borovec a0f7831278
fix miss-leading imports in tests (#5873)
* fix imorts

* .
2021-02-09 05:10:52 -05:00
rohitgr7 bcb6ee5d51 sync 2021-02-08 20:22:39 +01:00
Rohit Gupta cb67e1d0b2 Separate epoch validation from step validation (#5208)
* Seperate epoch validaton from step validation

* update system

* test

* baked logic in callbacks

* unbake logic in callbacks

* fix the call for scheduler

* use property

* pep

* correct rebase

* gitignore

* ref

* add tests

* fix

* add early stopping test

* trigger

* chlog

* rev

* 1.3

* log

* Apply suggestions from code review

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Update pytorch_lightning/trainer/training_loop.py

* Update CHANGELOG.md

* Apply suggestions from code review

Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

(cherry picked from commit e429f97b67)
2021-02-08 20:22:39 +01:00
Jirka Borovec bd920b4102
Refactor simplify tests (#5861)
* add new

* restructure

* yapf

* move

* fix
2021-02-08 11:52:02 +01:00
Jirka Borovec 42812bb003
prune SimpleModel (#5862) 2021-02-08 09:52:54 +01:00
Jirka Borovec 4faaef7758
formatting tests: 4/n (#5846)
* models

* ckpt

* core

* log
2021-02-06 12:07:26 +01:00
Jirka Borovec f83cca6107
formatting flake8 & isort (#5824)
* formatting

* isort

* make

* yapf

* isort
2021-02-05 18:33:12 -05:00
Kaushik B 5dfd62c09e Disable training with zero num_training_batches when insufficient limit_train_batches (#5703)
* disable training when zero num_train_batches with limit_train_batches

* refactor train skip condition

* fix formatting issues

* fix formatting issues

* ref: test error msg

* fix tests for data loader calls

* fix train dataloader condition

* update limit_train_batches upper range in test comment

* remove model state check test

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2021-02-05 21:40:42 +01:00
Sumanth Ratna 1c44f35cf3 Fix mypy 0.800 plus when prepending $PYTHONPATH to sys.path (#5698)
* Fix mypy when prepending $PYTHONPATH to sys.path

* attempt mypy fix

* Revert "attempt mypy fix"

This reverts commit fb7ed827d9.

* fix mypy

Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>
2021-02-05 21:40:40 +01:00
Adrian Wälchli bb7d188318 Fix ModelCheckpoint race condition in file existence check (#5155)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
2021-02-05 21:40:39 +01:00
Jirka Borovec d2c2e5004d fix tests 2021-02-04 20:55:58 +01:00
Swetha Mandava c62f68c7cd passing batch outputs to on_train_batch_end (#4369)
* passing batch outputs to on_train_batch_end

* styling

* updating epoch end logic

* also condition on on_train_epoch_end hooks

* more readable

* pep8

* pep8

* readability suggestion accepted

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* adding test_training_epoch_end_metrics_collection_on_override test

* fix formatting

* fix formatting

Co-authored-by: Swetha Mandava <smandava@nvidia.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Roger Shieh <sh.rog@protonmail.ch>

(cherry picked from commit 5fcca4e43b)
2021-02-04 20:55:41 +01:00
Lexie Troiano d05cdf83f1 Merge remote-tracking branch 'carmocca/sync-1.1.5' into release/1.2-dev 2021-02-04 09:42:59 -05:00
Kaushik B 26cc3b5357
Change the seq of on_train_batch_end, on_batch_end & on_train_epoch_end, on_epoch_end hooks (#5688) 2021-02-04 18:30:20 +05:30
Adrian Wälchli 9555043a29
Force ModelCheckpoint callback to run last (#5731) 2021-02-03 16:40:57 -05:00
Rohit Gupta 293984bbc0 fix reinit_schedulers with correct optimizer (#5519)
* update test

* syntax

* fix

* update test

* scheduler

* only apex

* fix

* rev drone

* chlog
2021-02-03 19:41:45 +01:00
Rohit Gupta 10c7dbe6a1 Refactor setup_training and remove test_mode (#5388)
* ref and fix call for on_pretrained_routine

* avoid failing tests

* unnecessary_call

* unnecessary call in accelerators

* tmpdir

* rm test_mode

* pep

* updates

* more ref

* Revert "more ref"

This reverts commit 5d9e95f873.

* more refac

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-02-03 19:41:42 +01:00
Adrian Wälchli 692f77b8a7
Refactor LightningDataParallel (#5670)
* module

* fix model access

* scalar conversion

* refactor

* kwargs

* auto unsqueeze

* refactor code duplication

* clean up

* docs

* update dp docs

* changelog

* generalize test

* test

* rename

* warning cache

* isort

* unsqueezing test

* device

* device

* scalar test

* device

* device

* include coverage of overrides

* clear

* add deprecation test

* docs

* improve coverage

* increase coverage

* fix merge

* extend test

* rename base class

* mention the predict method in docs

* combine iteration over collection

* remove override

* move

* line

* Apply suggestions from code review

* fix running stage

* f401

* fix cyclic import

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-01-31 06:08:16 -05:00
chaton 3da28fd634
[feat] 1/2 Add trainer.predict (#5579)
* start adding predict

* add predict

* resolve test

* add predict

* remove limit_predict

* update

* add test for predict

* typo

* update on comments

* remove predict_step

* update ddp_shareded

* check ddp_sharded

* resolve on comments

* resolve isort

* update dp

* add test dp 1 gpu

* made default forward

* resolve path

* resolve bug

* update on comments

* resolve doc

* resolve bug

* update

* resolve bug

* update on comments

* resolve pep8

* update test doc

* update on comments

* solve special tests

* resolve bug

* resolve flake8

* Update pytorch_lightning/callbacks/progress.py

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Update pytorch_lightning/trainer/trainer.py

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* add predict to LightningModule

* missing predict

* typo

* rename is_prediction to _predicting

* add

* update

* update

* update doc

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2021-01-27 11:38:14 -05:00
Jirka Borovec 7e2e874d95
Refactor: legacy accelerators and plugins (#5645)
* tests: legacy

* legacy: accel

* legacy: plug

* fix imports

* mypy

* flake8
2021-01-26 20:04:36 -05:00
Jirka Borovec 53b0ae49b9 fix imports / isort / flake8 2021-01-26 14:57:34 +01:00
SeanNaren df3c170b2c Fix imports & issues in lightning optimizer refactor merge 2021-01-26 14:29:47 +01:00
SeanNaren a80e37b95b Add hydra experimental to correct location 2021-01-26 14:29:47 +01:00
chaton 8e75f2cde0 bugfix: Resolve interpolation bug with Hydra (#5406)
* resolve bug

* Apply suggestions from code review

* resolve package import

* resolve import

* update on comments

* update on comments

* hacky fix

* update

* exit

* update

* to_container

* typo

* resolve import

* update

* resolve pep8

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>

(cherry picked from commit bb5031b3bf)
2021-01-26 14:29:47 +01:00
chaton 0435e23a64 deprecate enable_pl_optimizer as it is not restored properly (#5244)
* update

* clean test

* still in progress

* udpdate test

* update

* update

* resolve flake

* add test for zero_grad

* update

* works without accumulated_grad

* update

* update

* resolve amp

* revert back to True

* update

* clean tests

* cleaned out

* typo

* update test

* git repare bug

* remove print

* udpate

* Fix formatting/optimizer imports

* Refactor the test for cleanliness

* Add vanilla model to the test, better var names

* Fixed var names, let's clean up these mock tests

* repare test

* update test

* resolve flake8

* add manual_optimization

* update tests

* resolve flake8

* add random accumulate_grad_batches

* improve test

* Update tests/trainer/optimization/test_parity_automatic_optimization.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/trainer/optimization/test_parity_automatic_optimization.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* update

* clean tests

* correct bug

* Apply suggestions from code review

* format

* adress comments

* update on comments

* wip

* typo

* depreceate enable_pl_optimizer

* resolve latest bugs

* update

* resolve merge

* add comment

* Update pytorch_lightning/core/lightning.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/deprecated_api/test_remove_1-3.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/trainer/connectors/optimizer_connector.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/trainer/trainer.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/trainer/trainer.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/trainer/optimization/test_parity_automatic_optimization.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* update on comments

* update restore

* add a property

* remove setstate as not needed anymore

* update test

* provide optimizer to on_before_zero_grad

* update on comments

* update on comments

* Update pytorch_lightning/trainer/trainer.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Update tests/trainer/optimization/test_parity_automatic_optimization.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Update tests/trainer/optimization/test_parity_automatic_optimization.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Update tests/trainer/optimization/test_parity_automatic_optimization.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* mofidy import

* update changelog

* resolve flake8

* update

* update

* clean doc

Co-authored-by: SeanNaren <sean@grid.ai>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-62-109.ec2.internal>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>

(cherry picked from commit f2e99d617f)
2021-01-26 14:29:46 +01:00
chaton f2f4a49271 [bug-fix] Call transfer_batch_to_device in DDPlugin (#5195)
* hacking out

* update

* remove useless on_before_forward

* update

* remove overriden

* iremove os

* use on_before_forward

* resolve flake8

* add test

* update

* add single_process_per_device

* resolve flake8

* update

* resolve

* update

* update

* update

* add comment

* resolve bug with sharded

* update

* remove property

* update

* resolve test

* resolve bug

* update on comments

* update doc

* Update pytorch_lightning/core/hooks.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* update on comments

* Update pytorch_lightning/plugins/ddp_plugin.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* Update pytorch_lightning/plugins/ddp_plugin.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* resolve pep8

* add device_ids to pipe

* update on comments

* update

* resolve

* update

* update

* update

Co-authored-by: Ubuntu <ubuntu@ip-172-31-62-109.ec2.internal>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>

(cherry picked from commit d510707bc9)
2021-01-26 14:28:45 +01:00
Jirka Borovec 2fe36c7049
simple tests restructure (#5452)
* simple tests restructure

* logging_process

* typo
2021-01-15 20:58:20 -05:00
Arnaud Gelas ac531ec945
Fix pre-commit isort failure on tests/models/*.py (#5423)
* Remove tests.models from skipped module in pyproject.toml

* Fix pre-commit isort failure on tests/models/*.py
2021-01-14 09:42:01 -05:00
Adrian Wälchli 61308138c3
set find_unused_parameters=False in DDP as in pytorch (#5435)
* set find unused params to False

* add changelog

* fix changelog

* fix test

* update docs

* update changelog

Co-authored-by: chaton <thomas@grid.ai>
2021-01-13 10:13:40 -05:00
Rohit Gupta 1323cb2ed5
Add missing val/test hooks in LightningModule (#5467)
* add missing val/test hooks

* chlog

* None

Co-authored-by: Roger Shieh <sh.rog@protonmail.ch>
2021-01-13 01:09:47 -05:00
Jirka Borovec 059f4630c8
prune check on Trainer fit result (#5453)
* prune check on Trainer fit result

* flake8

* Apply suggestions from code review

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* .

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-01-11 19:36:48 -05:00
Gianluca Scarpellini 7464aca44e
test_cpu and test_gpu EvalModelTemplate deprecation (#4820)
* test_cpu refactoring - BoringModel and checkpoints; test_gpu refactoring - BoringModelboring_model refactoring - validation, testing; Fix - run_prediction as dispatcher for testing BoringModel

* Removed EvalModelTemplate import from test_cpu and test_gpu

* Reverting unintended changes

* Issues with checkpointing

* Fixed tests for logging and checkpointing

* Fix for dispatcher

* test_cpu refactoring - BoringModel and checkpoints; test_gpu refactoring - BoringModelboring_model refactoring - validation, testing; Fix - run_prediction as dispatcher for testing BoringModel

* Removed EvalModelTemplate import from test_cpu and test_gpu

* Reverting unintended changes

* Issues with checkpointing

* Fixed tests for logging and checkpointing

* Fix for dispatcher

* Fixed acc check for stocasticity of seeds

* Fixed according to @borda suggestions

* Hparams for boring_model

* Deprecated RuntimeParamChagneModelAssing (functionality is tested in RuntimeParamChangeModelSaving)

* Reduced boring_model parameters to just in and out features, test_cpu modelsinherit BoringModel to specify additional parameters (e.g., optimizer)

* Fix PEP8

* Update tests/base/develop_pipelines.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* Update tests/base/boring_model.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* Update tests/base/develop_pipelines.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* Update tests/models/test_cpu.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* Update tests/models/test_cpu.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* Merged test_early_stopping with all_features; added TODO for self.log

* Fixed test_all_features trainer options

* Ready for review!

* Update tests/models/test_cpu.py

Thank you! :)

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* Update tests/models/test_cpu.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* Update tests/models/test_cpu.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* Update tests/models/test_cpu.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* Update tests/models/test_cpu.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* added optimizer_name, lr, and batch_size as hparams for save_hparameters()

* Fixes for reducing PR size

* Reverse test_hparams (removed DEPRECATED test for hparams direct assignment)

* Changes for in_features

* Fixed hparams

* Fixed parameters for boring_model

* Update tests/models/test_cpu.py

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Update tests/models/test_cpu.py

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Update tests/models/test_cpu.py

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* fix for pep8

* Fixed run_predction and TODO

* fix min acc for darwin/windows without pl_opt

* eval as DEFAULT run_prediction strategy

* Updated val_dataloader for running_test_no_val

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-01-07 05:50:08 -05:00
tarepan bb366232e7 Add non-existing resume_from_checkpoint acceptance for auto-resubmit (#4402)
* Add empty resume_from_checkpoint acceptance #4366

* Fix general error catch with focused file check

* Add fsspec HTTP extras

Add fsspec's HTTPFileSystem  support through http extras.
pl has supported remote http file (e.g. #2925),
so this commit do not add new functionality.

* Fix potential too much logging in DDP

* Add PR changelog

* Add well-written argument explanation

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Fix DDP-compatible restore logging

Notify from where the states are restored.
This feature temporally deleted as a result of PR review.
With succeeding review, added with DDP compatibility.

* Fix utility import pathes

* Refactor load step commentaries

* Refactor hpc ckpt suffix acquisition

* Refactor restore/hpc_load match

* Refactor hpc load trial

* Refactor checkpoint dir check

* Refactor unneeded function nest

* Refactor nested If

* Refactor duplicated cache clear

* Refactor attempt flow with if/elif

* Fix pip8

* Refactor hook commentary

Co-authored-by: chaton <thomas@grid.ai>

* Fix pep8

* Refactor hpc load checkpoint path acquisition

* Fix pip8

* Fix typo

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Fix typo

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Fix doc

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Refactor None Union type with Optional

* Fix build-doc CI failure debuged in #5329

* Fix fsspec import during build-doc #5329

* Fix test epoch

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Fix test with latest test models

* .

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Roger Shieh <sh.rog@protonmail.ch>

(cherry picked from commit b0051e8c03)
2021-01-06 12:55:38 +01:00
Jirka Borovec 74d0652164 flake8 ++ 2021-01-05 09:58:37 +01:00
Adrian Wälchli cc14fc16bf skip multi-gpu test when running on single-gpu machine (#5186)
* skip test

* Apply suggestions from code review

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
2021-01-05 09:58:37 +01:00
Jirka Borovec af833f673c
drop deprecated TrainResult (#5323)
* drop TrainResult

* .

* .

* .

* .

* .

* .
2021-01-04 09:54:21 +08:00
Jirka Borovec 0f36525e8f
fix/enable - check F401 (#5201)
* refactor - check F401

* missed

* fix
2020-12-21 10:15:04 +01:00
Jirka Borovec 6d2c564bc6
refactor - check F841 (#5202) 2020-12-21 11:10:55 +05:30
Jirka Borovec 059eaecbb4
set xxx_AVAILABLE as protected (#5082)
* sett xxx_AVAILABLE as protected

* docs
2020-12-14 20:19:05 +05:30
tarepan 16feb5137b
Refactor load in checkpoint connector (#4593)
* Refactor load step commentaries

* Refactor hpc ckpt suffix acquisition

* Refactor restore/hpc_load match

* Refactor hpc load trial

* Refactor checkpoint dir check

* Refactor unneeded function nest

* Refactor nested If

* Refactor duplicated cache clear

* Refactor attempt flow with if/elif

* Fix pip8

* Refactor hook commentary

Co-authored-by: chaton <thomas@grid.ai>

* Fix pep8

* Refactor hpc load checkpoint path acquisition

* Fix pip8

* Fix doc

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Refactor None Union type with Optional

Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Roger Shieh <sh.rog@protonmail.ch>
2020-12-14 00:13:50 +08:00
Jirka Borovec a49291d98d
drop unused test with result api (#5058)
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-12-12 21:51:19 +05:30
Rohit Gupta 3100b7839a
Allow any input in to_onnx and to_torchscript (#4378)
* branch merge

* sample

* update with valid input tensors

* pep

* pathlib

* Updated with BoringModel and added more input types

* try fix

* pep

* skip test with torch < 1.4

* fix test

* Apply suggestions from code review

* update tests

* Allow any input in to_onnx and to_torchscript

* Update tests/models/test_torchscript.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* no_grad

* try fix random failing test

* rm example_input_array

* rm example_input_array

Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Jeff Yang <ydcjeff@outlook.com>
Co-authored-by: Roger Shieh <sh.rog@protonmail.ch>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: edenlightning <66261195+edenlightning@users.noreply.github.com>
2020-12-12 18:17:03 +08:00
Jirka Borovec 05f25f3a54
update usage of deprecated checkpoint_callback (#5006)
* drop usage of deprecated checkpoint_callback

* fix

* fix
2020-12-09 14:14:34 -05:00
Jirka Borovec 53d7c9555c
drop usage of deprecated distributed_backend (#5009)
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Roger Shieh <sh.rog@protonmail.ch>
2020-12-09 09:18:23 +01:00
Jirka Borovec ab7c947961
simplify CI horovod (#4951)
* simplify CI horovod

* reorder
2020-12-07 10:31:33 +01:00
Jirka Borovec 3976db597d
refactor imports of optional dependencies (#4859)
* refactor imports of optional dependencies

* fix

* fix

* fix

* fix

* fix

* flake8

* flake8

Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: chaton <thomas@grid.ai>
2020-12-04 10:26:10 +01:00
Lezwon Castelino 12cb9942a1
Tpu save (#4309)
* convert xla tensor to cpu before save

* move_to_cpu

* updated CHANGELOG.md

* added on_save to accelerators

* if accelerator is not None

* refactors

* change filename to run test

* run test_tpu_backend

* added xla_device_utils to tests

* added xla_device_utils to test

* removed tests

* Revert "added xla_device_utils to test"

This reverts commit 0c9316bb

* fixed pep

* increase timeout and print traceback

* lazy check tpu exists

* increased timeout
removed barrier for tpu during test
reduced epochs

* fixed torch_xla imports

* fix tests

* define xla utils

* fix test

* aval

* chlog

* docs

* aval

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-12-02 13:05:11 +00:00
chaton c2e6e68c7e
optimizer clean up (#4658)
* add LightningOptimizer

* typo

* add mock closure

* typo

* remove logic in optimizer_step

* update

* update

* update

* desactivate LightningOptimizer for hovorod

* resolve flake

* typo

* check optimizer name

* change name

* added backward to LightningOptimizer

* remove use_lightning_optimizer

* move update

* simplify init

* resolve comments

* resolve bug

* update

* update

* resolve bugs

* resolve flake8

* set state

* work manual_optimizer_step

* add doc

* add enable_pl_optimizer

* make optimizer_step

* add make_optimizer_step

* add examples

* resolve test

* add test_optimizer_return_options_enable_pl_optimizer

* add enable_pl_optimizer=True

* update

* update tests

* resolve bugs

* update

* set Trainer to False

* update

* resolve bugs

* update

* remove from doc

* resolve bug

* typo

* update

* set to True

* simplification

* typo

* resolve horovod

* unwrap horovod

* remove Optimizer

* resolve horovod

* move logic to amp_backend

* doesn't seem to be pickable

* update

* add again

* resolve some bugs

* cleanup

* resolve bug with AMP

* change __repr__

* round at -12

* udpate

* update

* update

* remove from horovod

* typo

* add convert_to_lightning_optimizers in each accelerators

* typo

* forgot

* forgot a convert_to_lightning_optimizers

* update

* update

* update

* increase coverage

* update

* resolve flake8

* update

* remove useless code

* resolve comments + add support for LightningOptimizer base class

* resolve flake

* check optimizer get wrapped back

* resolve DDPSharded

* reduce code

* lightningoptimizer

* Update pytorch_lightning/core/optimizer.py

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Update pytorch_lightning/core/lightning.py

* remove reference to step function

* Apply suggestions from code review

* update on comments

* resolve

* Update CHANGELOG.md

* add back training_step in apex and native_amp

* rename optimizer_step

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-12-01 00:09:46 +00:00