Commit Graph

37 Commits

Author SHA1 Message Date
Jirka Borovec 4faaef7758
formatting tests: 4/n (#5846)
* models

* ckpt

* core

* log
2021-02-06 12:07:26 +01:00
Adrian Wälchli bb7d188318 Fix ModelCheckpoint race condition in file existence check (#5155)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
2021-02-05 21:40:39 +01:00
Jirka Borovec 7e2e874d95
Refactor: legacy accelerators and plugins (#5645)
* tests: legacy

* legacy: accel

* legacy: plug

* fix imports

* mypy

* flake8
2021-01-26 20:04:36 -05:00
Jirka Borovec 53b0ae49b9 fix imports / isort / flake8 2021-01-26 14:57:34 +01:00
chaton 0435e23a64 deprecate enable_pl_optimizer as it is not restored properly (#5244)
* update

* clean test

* still in progress

* update test

* update

* update

* resolve flake

* add test for zero_grad

* update

* works without accumulated_grad

* update

* update

* resolve amp

* revert back to True

* update

* clean tests

* cleaned out

* typo

* update test

* git repair bug

* remove print

* update

* Fix formatting/optimizer imports

* Refactor the test for cleanliness

* Add vanilla model to the test, better var names

* Fixed var names, let's clean up these mock tests

* repair test

* update test

* resolve flake8

* add manual_optimization

* update tests

* resolve flake8

* add random accumulate_grad_batches

* improve test

* Update tests/trainer/optimization/test_parity_automatic_optimization.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/trainer/optimization/test_parity_automatic_optimization.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* update

* clean tests

* correct bug

* Apply suggestions from code review

* format

* address comments

* update on comments

* wip

* typo

* deprecate enable_pl_optimizer

* resolve latest bugs

* update

* resolve merge

* add comment

* Update pytorch_lightning/core/lightning.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/deprecated_api/test_remove_1-3.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/trainer/connectors/optimizer_connector.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/trainer/trainer.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/trainer/trainer.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/trainer/optimization/test_parity_automatic_optimization.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* update on comments

* update restore

* add a property

* remove setstate as not needed anymore

* update test

* provide optimizer to on_before_zero_grad

* update on comments

* update on comments

* Update pytorch_lightning/trainer/trainer.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Update tests/trainer/optimization/test_parity_automatic_optimization.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Update tests/trainer/optimization/test_parity_automatic_optimization.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Update tests/trainer/optimization/test_parity_automatic_optimization.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* modify import

* update changelog

* resolve flake8

* update

* update

* clean doc

Co-authored-by: SeanNaren <sean@grid.ai>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-62-109.ec2.internal>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>

(cherry picked from commit f2e99d617f)
2021-01-26 14:29:46 +01:00
Arnaud Gelas ac531ec945
Fix pre-commit isort failure on tests/models/*.py (#5423)
* Remove tests.models from skipped module in pyproject.toml

* Fix pre-commit isort failure on tests/models/*.py
2021-01-14 09:42:01 -05:00
Jirka Borovec 059f4630c8
prune check on Trainer fit result (#5453)
* prune check on Trainer fit result

* flake8

* Apply suggestions from code review

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* .

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-01-11 19:36:48 -05:00
Jirka Borovec 74d0652164 flake8 ++ 2021-01-05 09:58:37 +01:00
Jirka Borovec 0f36525e8f
fix/enable - check F401 (#5201)
* refactor - check F401

* missed

* fix
2020-12-21 10:15:04 +01:00
Jirka Borovec 059eaecbb4
set xxx_AVAILABLE as protected (#5082)
* set xxx_AVAILABLE as protected

* docs
2020-12-14 20:19:05 +05:30
Jirka Borovec 53d7c9555c
drop usage of deprecated distributed_backend (#5009)
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Roger Shieh <sh.rog@protonmail.ch>
2020-12-09 09:18:23 +01:00
Jirka Borovec 3976db597d
refactor imports of optional dependencies (#4859)
* refactor imports of optional dependencies

* fix

* fix

* fix

* fix

* fix

* flake8

* flake8

Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: chaton <thomas@grid.ai>
2020-12-04 10:26:10 +01:00
chaton c2e6e68c7e
optimizer clean up (#4658)
* add LightningOptimizer

* typo

* add mock closure

* typo

* remove logic in optimizer_step

* update

* update

* update

* deactivate LightningOptimizer for horovod

* resolve flake

* typo

* check optimizer name

* change name

* added backward to LightningOptimizer

* remove use_lightning_optimizer

* move update

* simplify init

* resolve comments

* resolve bug

* update

* update

* resolve bugs

* resolve flake8

* set state

* work manual_optimizer_step

* add doc

* add enable_pl_optimizer

* make optimizer_step

* add make_optimizer_step

* add examples

* resolve test

* add test_optimizer_return_options_enable_pl_optimizer

* add enable_pl_optimizer=True

* update

* update tests

* resolve bugs

* update

* set Trainer to False

* update

* resolve bugs

* update

* remove from doc

* resolve bug

* typo

* update

* set to True

* simplification

* typo

* resolve horovod

* unwrap horovod

* remove Optimizer

* resolve horovod

* move logic to amp_backend

* doesn't seem to be picklable

* update

* add again

* resolve some bugs

* cleanup

* resolve bug with AMP

* change __repr__

* round at -12

* update

* update

* update

* remove from horovod

* typo

* add convert_to_lightning_optimizers in each accelerators

* typo

* forgot

* forgot a convert_to_lightning_optimizers

* update

* update

* update

* increase coverage

* update

* resolve flake8

* update

* remove useless code

* resolve comments + add support for LightningOptimizer base class

* resolve flake

* check optimizer get wrapped back

* resolve DDPSharded

* reduce code

* lightningoptimizer

* Update pytorch_lightning/core/optimizer.py

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Update pytorch_lightning/core/lightning.py

* remove reference to step function

* Apply suggestions from code review

* update on comments

* resolve

* Update CHANGELOG.md

* add back training_step in apex and native_amp

* rename optimizer_step

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-12-01 00:09:46 +00:00
Jirka Borovec 11e73ceaa6
fix import and typo in AMP (#4871)
* fix import and typo

* docs

* apex

* fix

* typo
2020-11-26 23:45:52 +01:00
Travis Addair 51cc7a89ee
Horovod: fixed early stopping and added metrics aggregation (#3775)
* Fixed early stopping for Horovod

* Refactored to sync_dist_if_available

* Bump min Horovod version to support hvd.is_initialized

* Changelog

* Added back change for Horovod

* Removed redundant checks for initialization

* Implement metrics gathering for Horovod

* Added test for EvalResult

* Renamed ddp_sync_on_step -> dist_sync_on_step

* Added metric test for Horovod

* Added option pass callable allgather function to metric base class

* Added dist_sync_fn

* Fixed calls to private _sync_dist

* Fixed Horovod test

* Added sync_tensor to the distributed backend

* Skip Windows

* Insert test path

* Removed redundant import

* Updated drone

* Unset HOROVOD_GPU_ALLREDUCE

* Unset

* No cache dir

* No uninstall

* Unset variables

* Uninstall Horovod during initialization

* Replaced more references to ddp_sync_on_step

* Fixed imports

* Fixed attribute

* Added back default

* Lint

* Added back docstring

* Made gather_all_tensors default

* Added whitespace

* Update tests/models/test_horovod.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/metrics/metric.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update CHANGELOG.md

Co-authored-by: Teddy Koker <teddy.koker@gmail.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-11-05 12:52:02 -05:00
Jeff Yang ee414d25be
Switch to PyTorch 1.6 in Drone CI (#4393)
* switch to 1.6

* readme

* 1.7

* back to normal [ci skip]

* horovodrun --verbose

* try with apex

* add apex test

* change base

* description

* test with 1.7

* back to 1.6

* no gradient_clip_val

* re-add gradient_clip_val

* no amp

* temp skip torch.cuda.amp + horovod test

* Apply suggestion from code review

Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>

* Fix formatting

* ddp

* Moved extended model outside of function to prevent pickling issue for drone

* typo

* resolve bug

* extract automatic_automization

Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: SeanNaren <sean@grid.ai>
Co-authored-by: chaton <thomas@grid.ai>
2020-11-03 18:01:51 +00:00
William Falcon 09c2020a93
notices (#4118) 2020-10-13 07:18:07 -04:00
Adrian Wälchli f37e9e8a83
Fix global step increment on training_epoch_end (#3673)
* fix

* fix global step err

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-09-27 20:19:51 -04:00
William Falcon 8f6b115511
ref: added model connector (#3407)
* ref: added model connector

* ref: added model connector

* ref: added model connector
2020-09-09 00:24:20 -04:00
Travis Addair 091d37f968
Added check for apex AMP and unit tests for Horovod + AMP (#3404)
* Added check for apex AMP and unit tests for Horovod + AMP

* Changelog

* Fixed order of Horovod and Apex optimizer wrapping
2020-09-08 20:30:57 -04:00
Adrian Wälchli 4ad5a78dce
to_torchscript method for LightningModule (#3258)
* script

* docs

* simple test

* move test

* fix doctest

* no grad context

* extend tests


test


test

* datamodule test

* clean up test

* docs

* name

* fix import

* update changelog

* fix import

* skip pytorch 1.3 in test

* update codeblock

* skip bugged 1.4

* typehints

* doctest not working on all pytorch versions

* rename TestGAN to prevent pytest interference

* add note about pytorch version

* fix torchscript version inconsistency in tests

* reset training state + tests

* update docstring

* Apply suggestions from code review

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* update docstring, dict return

* add docs to index

* add link

* doc eval mode

* forward

* optional save to file path

* optional

* test torchscript device

* test save load with file path

* pep

* str

* Commit typing suggestion

Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>

* skip test if cuda not available

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
2020-09-03 20:24:44 +02:00
Jirka Borovec ed3ee982b3
clean tests imports (#2834) 2020-08-06 16:58:51 +02:00
Jirka Borovec 590e7fb1fd
tests: add default_root_dir=tmpdir (#2392)
* tests: add default_root_dir=tmpdir

* remove duplicate tmpdir args

* add missing fixture

* test requires multi gpu

* typo

* resize

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-07-28 09:47:53 -04:00
Travis Addair 1369012bc7
Horovod: adjust base LR used by schedulers to scale with the number of workers (#2626)
* Horovod: Adjust base LR used by schedulers to match that of the optimizer after scaling by number of workers

* Added unit test

* Removed debug statements

* Updated changelog

* Apply suggestions from code review

Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-07-23 12:14:57 -04:00
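The learning-rate adjustment described in #2626 can be illustrated with a minimal sketch. It assumes linear scaling (each base LR multiplied by the worker count), which the commit message implies but does not spell out. `scale_base_lrs` and its parameters are hypothetical names chosen for illustration, not Lightning or Horovod API.

```python
from typing import List


def scale_base_lrs(base_lrs: List[float], num_workers: int) -> List[float]:
    """Return scheduler base LRs multiplied by the worker count.

    Assumption: linear scaling, so that a scheduler's base LR matches an
    optimizer LR that was already multiplied by the number of workers.
    """
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    return [lr * num_workers for lr in base_lrs]


if __name__ == "__main__":
    # Two parameter groups trained on four workers.
    print(scale_base_lrs([0.5, 0.25], num_workers=4))
```

Without such an adjustment, a scheduler that resets toward its own unscaled base LR would presumably undo the scaling applied to the optimizer.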
Adrian Wälchli 78db847e42
Fixed skipped horovod tests (#2514)
* skip ckpt test on rank > 0

* fix test

* add extra assert

* code factor

* add back removed

* add old loading code

* add back old

* unused import

* add same skip to run_model_without_loggers

* test if horovod now works with python 3.8

* test remove all 3.8 skips

* remove spawn

* fix

* fix test

* move load check up

* fix test multigpu

* rename

* fix gpu mode

* on gpu fix when on cpu

* move
2020-07-07 14:54:07 -04:00
William Falcon 11069c8784
Fix ddp tests + .test() (#2512)
* added base tests for tpu

* fix deprecation warnings

* Update pytorch_lightning/trainer/trainer.py

Co-authored-by: Jeremy Jordan <13970565+jeremyjordan@users.noreply.github.com>

Co-authored-by: Jirka <jirka@pytorchlightning.ai>
Co-authored-by: Jeremy Jordan <13970565+jeremyjordan@users.noreply.github.com>
2020-07-07 12:24:56 -04:00
Adrian Wälchli 145670f893
fix logging on rank 0 only (#2425)
* fix and test for ddp block logging rank > 0

* rename

* use the dummy logger

* dummy logger test

* set the logger in model

* decorator for rank zero experiment

* simplify check

* simplify

* fix problem with None in checkpoint path

* revert configure logger

* unused import

* offline

* try rank 0 decorator in checkpoint

* try fix test

* imgs

* add asserts to make sure log zero only saves checkpoints

* fix tpu tests

* fix tpu tests

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-06-30 18:09:16 -04:00
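The "decorator for rank zero experiment" idea from #2425 can be sketched as follows. This is an illustration under stated assumptions, not the code from the commit: `rank_zero_only_sketch`, `current_rank`, and `log_metric` are hypothetical names, and the process rank is supplied by a caller-provided function rather than read from any real distributed backend.

```python
import functools
from typing import Any, Callable, Optional


def rank_zero_only_sketch(rank_getter: Callable[[], int]) -> Callable:
    """Build a decorator that runs the wrapped function on rank 0 only."""

    def decorator(fn: Callable[..., Any]) -> Callable[..., Optional[Any]]:
        @functools.wraps(fn)
        def wrapper(*args: Any, **kwargs: Any) -> Optional[Any]:
            if rank_getter() == 0:
                return fn(*args, **kwargs)
            return None  # every other rank skips the call

        return wrapper

    return decorator


# Hypothetical stand-in for the process rank (assumption for the demo).
current_rank = {"value": 0}


@rank_zero_only_sketch(lambda: current_rank["value"])
def log_metric(name: str, value: float) -> str:
    """Pretend to log a metric and return the formatted record."""
    return f"{name}={value}"
```

Calling `log_metric` while `current_rank["value"]` is non-zero returns `None`, which mirrors the intent of blocking logging on ranks greater than 0.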
Jirka Borovec f1c96930b1
repair CI for Win (#2358)
* no cov

* no cov

* ReduceOp

* group

* reduce_op.sum

* Update sklearns.py

* formatting

* horovod

* Apply suggestions from code review

* horovod

* horovod

* horovod

* horovod

* ci

* print

* ci

* timeout

* timeout

* time

* fix

* distributed cpu

* pipes

* time

* cpu

* spawn

* spawn

* spawn

* tp

* separate

* os

* os

* npm

* Fix load_from_checkpoint() not working with URL on Windows

* Update CHANGELOG

* Update CHANGELOG.md

Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com>

* Apply suggestions from code review

* fix

* fix meta tags creating empty lines

* pyright

* node

* fix httpserver address

* drop tutils.default_trainer_options

* imports

* Better fix for load_from_checkpoint() not working with absolute path on Windows (#2294)

* Fix load_from_checkpoint() not working with URL on Windows

* Update CHANGELOG

* Update CHANGELOG.md

Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com>

* drop duplicate

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: airium <airium@outlook.com>
Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: AIRIUM <38249940+airium@users.noreply.github.com>
2020-06-26 21:38:25 -04:00
William Falcon 2411c3be70
replace train_percent_check with limit_train_batches (#2220)
* drop train_percent_check

* chlog

* deprecated

* deprecated

* deprecated

* tests

* tests

* Apply suggestions from code review

* tests

* hydra support

* tests

* hydra support

* hydra support

* hydra support

* tests

* typo

* typo

* Update test_dataloaders.py

* docs

* docs

* docs

* docs

Co-authored-by: Jirka <jirka@pytorchlightning.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-06-17 13:42:28 -04:00
William Falcon 04c794ca72
[WIP] Rename overfit_pct to overfit_batches (and fix) and val_percent_check and test_percent_check (and fix) (#2213)
* fixed percent check for val/test

* overfit_pct now uses train loaders for val and test and does not shuffle

* add on fit_start on fit_end hooks

* add on fit_start on fit_end hooks

* add on fit_start on fit_end hooks

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-06-17 08:03:28 -04:00
William Falcon caa9c6760b
replace Hparams by init args (#1896)
* remove the need for hparams

* replace self.hparams

* fixed

* finished moco

* basic

* testing

* todo

* recurse

* hparams

* persist

* hparams

* chlog

* tests

* tests

* tests

* tests

* tests

* tests

* review

* saving

* tests

* tests

* tests

* docs

* finished moco

* hparams

* review

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* hparams

* overwrite

* transform

* transform

* transform

* transform

* cleaning

* cleaning

* tests

* examples

* examples

* examples

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* chp key

* tests

* Apply suggestions from code review

* class

* updated docs

* updated docs

* updated docs

* updated docs

* save

* wip

* fix

* flake8

Co-authored-by: Jirka <jirka@pytorchlightning.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-05-24 18:59:08 -04:00
kumuji 619f984c36
Option to provide seed to random generators to ensure reproducibility (#1572)
* Option to provide seed to random generators to ensure reproducibility

I added a small function in utilities that imports torch, numpy, and
Python's random and sets the seed for all of these libraries to ensure
reproducibility of results.

* Apply recommendations from core contributors on seeding

1. Moved the seeding code to another file
2. Make deterministic as a parameter for trainer class
3. Add assertions for seeding numpy
4. Added warnings
5. torch.manual_seed should be enough for seeding torch

* Revert "Apply recommendations from core contributors on seeding"

This reverts commit a213c8e6882eec8a9e7408b9418926d2db7c5461.

* Revert "Revert "Apply recommendations from core contributors on seeding""

This reverts commit 59b2da53c62878de7aab0aa3feb3115e105eea06.

* Change in test, for correct seeding

* Allow seed equal to 0

* Allow seed to be uint32.max

* Added deterministic to benchmarks

* Cuda manual seed as in benchmark seeding

* Seeding should be done before model initialization

* cuda manual_seed is not necessary

* Fixing seed test_cpu_lbfgs

On some seeds, it seems that lbfgs doesn't converge.
So I fixed the seed during testing.

* rebasing issue with old reproducibility.py

* Improved documentation and ability to seed before initializing Train
class

* Change in docs

* Removed seed from trainer, update for documentation

* Typo in the docs

* Added seed_everything to _all_

* Fixing old changes

* Model initialization should be earlier than Trainer

* Update pytorch_lightning/trainer/__init__.py

From Example to testcode

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Fixing according to the contributors' suggestions

* Moving horovod deterministic to Trainer class

* deterministic flag affects horovod docs update

* Improved static typing

* Added deterministic to test runners of horovod

It is failing on some versions, not very predictable

* static seeds for horovod tests

* Change for reset_seed function in tests

* Seeding horovod using reset_seed from tutils

* Update pytorch_lightning/trainer/__init__.py

* chlog

* Update trainer.py

* change "testcode" to "Example" in trainer init documentation

* Update pytorch_lightning/trainer/seed.py, first line in comment

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-05-12 07:53:20 -04:00
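The seeding helper described in #1572 can be approximated by the sketch below. It is not the code from the commit: `seed_all_sketch` is a hypothetical name, the uint32 bounds follow the "Allow seed equal to 0" and "Allow seed to be uint32.max" bullets, and numpy and torch are treated as optional so the example also runs where they are not installed (an assumption made for illustration).

```python
import random

# Bounds taken from the commit bullets: 0 and uint32.max are both allowed.
MIN_SEED = 0
MAX_SEED = 2**32 - 1


def seed_all_sketch(seed: int) -> int:
    """Seed Python's random, and numpy / torch when they are importable."""
    if not (MIN_SEED <= seed <= MAX_SEED):
        raise ValueError(f"seed must be in [{MIN_SEED}, {MAX_SEED}], got {seed}")

    random.seed(seed)

    try:
        import numpy as np  # optional dependency in this sketch

        np.random.seed(seed)
    except ImportError:
        pass

    try:
        import torch  # optional dependency in this sketch

        # Per the commit notes, torch.manual_seed is treated as sufficient.
        torch.manual_seed(seed)
    except ImportError:
        pass

    return seed


if __name__ == "__main__":
    seed_all_sketch(0)
    first = random.random()
    seed_all_sketch(0)
    print(first == random.random())
```

As the commit notes, seeding should happen before the model is initialized, since weight initialization draws from the same random generators.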
Jirka Borovec 134eb61e1a
Tests: refactor cleanup (#1744)
* wip

* cleaning

* optim imports

* -

* default hparams

* fix restore

* fix imports
2020-05-10 13:15:28 -04:00
Jirka Borovec 1077159834
Tests: refactor models (#1691)
* refactor default model

* drop redundant seeds

* drop redundant seeds

* refactor models tests

* refactor models tests

* imports

* fix conf

* Apply suggestions from code review
2020-05-04 11:38:08 -04:00
Jirka Borovec f380027951
refactor default model (#1652)
* refactor default model

* drop redundant seeds

* formatting

* path

* formatting

* rename
2020-05-02 08:38:22 -04:00
Travis Addair 2950f66983
Fix Horovod distributed backend to set the root_gpu property (#1669)
* params

* drop acc

* Fix Horovod distributed backend to set the root_gpu

* Fixed test

* Fixed tests

* Fixed lint

* Set root_gpu during initialization

* chlog

Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2020-05-01 14:13:35 -04:00
Travis Addair 7024177f7d
Added Horovod distributed backend (#1529)
* Initial commit of Horovod distributed backend implementation

* Update distrib_data_parallel.py

* Update distrib_data_parallel.py

* Update tests/models/test_horovod.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/models/test_horovod.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Fixed tests

* Added six

* tests

* Install tox for GitHub CI

* Retry tests

* Catch all exceptions

* Skip cache

* Remove tox

* Restore pip cache

* Remove the cache

* Restore pip cache

* Remove AMP

Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-22 17:39:08 -04:00