Commit Graph

791 Commits

Author SHA1 Message Date
Nicki Skafte cefc7f7c32
Feature/log computational graph (#3003)
* add methods

* log in trainer

* add tests

* changelog

* fix tests

* fix tests

* fix tests

* fix tests

* fix tests

* fix tests

* fix tests

* text

* added argument

* update tests

* fix styling

* improve testing
2020-08-19 19:08:46 -04:00
Adrian Wälchli 7b917de946
fix setting batch_size attribute in batch_size finder (finishing PR #2523) (#3043)
* lightning attr fix

* revert refactor

* create test

* separate test

* changelog update

* tests

* revert

* Update pytorch_lightning/trainer/training_tricks.py

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-08-19 19:01:55 -04:00
Justus Schock 7358d456f3
Retrieve last logged val from result by key (#3049)
* return last logged value

* Update test_results.py

* Update step_result.py

* Update step_result.py

* pep8

* pep8
2020-08-19 18:59:14 -04:00
Adrian Wälchli 89a5d8fee9
fix auto scale batch size not working with precision=16 (#3045)
* add test

* test

* test

* add fix

* changelog

* check batch size changed
2020-08-19 20:41:33 +00:00
Adrian Wälchli 9031dc3b81
Fix result gathering with varying tensor shapes (#3020)
* test for gethering results

* fix gather

* document tests

* changelog

* assert dtype

* default to concat

* additional test
2020-08-18 20:27:48 -04:00
William Falcon 8315a65d0a
fix result obj dp auto reduce (#3013)
* fix result for dp

* fix result for dp

* fix result for dp

* fix result for dp

* fix result for dp

* fix result for dp

* fix result for dp

* fix result for dp

* fix result for dp

* fix result for dp

* fix result for dp

* fix result for dp

* added warning when changing monitor and using results obj
2020-08-17 10:29:39 -04:00
William Falcon 51de6802ed
added warning when changing monitor and using results obj (#3014)
* added warning when changing monitor and using results obj

* added warning when changing monitor and using results obj

* added warning when changing monitor and using results obj
2020-08-17 10:29:28 -04:00
William Falcon 465d4ffd2c
added lr scheduler test using dev debugger (#3004)
* added lr scheduler test using dev debugger

* added lr scheduler test using dev debugger

* added lr scheduler test using dev debugger
2020-08-16 11:37:38 -04:00
Adrian Wälchli 188e06c261
ddp fix for trainer.test() + add basic ddp tests (#2997)
* add ddp script variations

* add ddp test

* rename

* shell

* test

* test

* try call

* try without subprocess

* test

* display the error

* list all variations

* try string

* try copy env

* debug

* pythonpath

* path

* update test

* change

* simple ddp test

* replace

* remove random port

* random port

* str

* clean up

* check run spawn

* clean up

* docs

* docs

* update test

* docs

* changelog

* changelog
2020-08-16 11:19:57 -04:00
William Falcon d702d4d393
removed callback metrics from test results obj (#2994)
* removed callback metrics from test results obj

* removed callback metrics from test results obj
2020-08-15 21:45:41 -04:00
William Falcon 766d0f391b
re-trigger build (#2988)
* fixed build

* fixed build
2020-08-15 21:13:00 -04:00
Jeff Yang 73ebd1066d
Fix accumulate_grad_batches for last batch (#2853)
* first attempt

* update changelog

* fix pep8 and tests

* Apply suggestions from code review

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* added new tests

* fixed tests

* Apply suggestions from code review

* used num_training_batches

* fixed pep8

* fixed with is_last_batch suggested by @awaelchli

* fixed with num_training_batches

* fixed with num_training_batches

* cleanup

* fix test and update docs

* fixed for alignment, update docs

* minor changes

* update doc

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-08-15 15:06:37 -04:00
William Falcon b8371fa56c
Fixes #2972 #2946 (#2986)
* add val step arg to metrics

* add val step arg to metrics

* add val step arg to metrics

* add val step arg to metrics

* add val step arg to metrics

* add val step arg to metrics

* add val step arg to metrics

* add val step arg to metrics

* add val step arg to metrics

* add step metrics

* add step metrics
2020-08-15 08:36:00 -04:00
Nathan Raw b9695237f1
Save test predictions on multiple GPUs (#2926)
* Save test predictions on multiple GPUs
2020-08-14 17:52:43 -04:00
Lezwon Castelino cfd06a083b
Bugfix/2956 tpu distrib backend fix (#2959)
* override dist backend when using tpus

* added test

* updated doc string

* drop redundant info...

* more redundant info

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
2020-08-13 18:57:23 -04:00
shijianjian 18d31a3b63
Added strict=False for load_from_checkpoint (#2819)
* Added strict=False and hparams_file accepcts dict

* Apply suggestions from code review

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* Type check fix

* Added tests

* Linting & test fix

* Removed redundant code & test

* Added strict=False and hparams_file accepcts dict

* Apply suggestions from code review

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* Type check fix

* Added tests

* Linting & test fix

* Removed redundant code & test

* Apply suggestions from code review

* tests

* tests

* chlog

* Update tests/models/test_restore.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* update test comments

* Added docstring for the strict attribute

* Added supplementary tests

* Update saving.py

* Apply suggestions from code review

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* pep8, removed extra func

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: ananyahjha93 <ananya@pytorchlightning.ai>
2020-08-13 16:25:43 -04:00
Jirka Borovec 4354690e55
add apex test (#2921)
* add apex test

* rename

* level

* events

* wrap

* evt

* miss

* apex

* apex

* apex

* apex

* apex

* apex

* Update tests/models/test_amp.py

Co-authored-by: William Falcon <waf2107@columbia.edu>

* notes

* notes

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-08-13 10:03:13 -04:00
Jirka Borovec 665c1507f0
deterministic=True (#2944) 2020-08-13 06:29:27 -04:00
Adrian Wälchli 411914bd2b
Fix hparams loading for model that accepts *args (#2911)
* fix hparams loading for model that accepts *args

* add test case

* changelog

* pep

* fix test

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-08-12 09:58:35 -04:00
William Falcon d13e5c9e53
document lightiningmodule better (#2920)
* updated docs
2020-08-11 19:39:43 -04:00
Adrian Wälchli 69d241c82e
Do not pass non_blocking=True if it does not support this argument (#2910)
* add docs

* non blocking only on tensor

* changelog

* add test case

* add test comment

* update changelog


changelog


chlog
2020-08-11 19:28:37 -04:00
William Falcon 28f79d9f7a
Mapkeys (#2900)
* added a map dict

* added a map dict
2020-08-09 18:50:39 -04:00
Adrian Wälchli 1ac507a255
constant root seed in reset_seed (tests) (#2895)
* fix root_seed in reset_seed

* seed value
2020-08-09 21:23:01 +00:00
Caldera 6c18fd9a24
Update lr_logger.py (#2847)
* Update lr_logger.py

when logging learning_rate, we should provide different choices to log including 'step' and 'epoch'

* Update lr_logger.py

add some type annotations and docstrings

* Update lr_logger.py

fixed a bug where `on_train_batch_start()` can't be triggered, instead, we should use on_batch_start(); add `interval` args so that we can record learning_rates with respect to `global_step` or `current_epoch`.

* Update lr_logger.py

restore _extract_lr()

* suggestion

* Update lr_logger.py

modify _extract_lr(), it no more need to pass `interval` parameter.

* Update test_lr_logger.py

SkafteNicki 's suggetion

* log_interval now supports `None`, `step`, `epoch`

* change `log_interval` to `logging_interval`

* Update test_lr_logger.py

* Update lr_logger.py

* put types check into `on_train_start()`

* cleanup

* docstring typos

* minor changes from suggestions

Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2020-08-09 16:30:43 +00:00
Uladzislau Sazanovich e9846dd758
Add tracking of basic states in Trainer [wip - to-be-merged after v0.9] (#2541)
* Add initial tracking of states in Trainer.

* Add INTERRUPTED state, improve tests, move state switching from callback to a trainer.

* Move part of a trainer state switching to a decorator.

* Add documentation.

* Fix docs, rename state enum, restore state to previous on exit if None, add tests for decorator only.

* Fix callback typing.

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-08-09 06:24:09 -04:00
Rohit Gupta 983c030326
fix reduction docstring and clean tests (#2885)
* fix reduction docstring

* Update docstring and some cleanup

* miss

* suggestion from code review

Co-authored-by: Ananya Harsh Jha <ananya@pytorchlightning.ai>

Co-authored-by: Ananya Harsh Jha <ananya@pytorchlightning.ai>
2020-08-09 06:03:24 -04:00
William Falcon 256059a1d0
tracks all outputs including TBPTT and multiple optimizers (#2890)
* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update
2020-08-09 06:00:15 -04:00
Rohit Gupta 4d0406ec8b
deepcopy model state_dict in tests (#2887)
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2020-08-08 16:13:06 +00:00
Adrian Wälchli f798cffd02
save last model after saving top_k when save_last=True (#2881)
* save_last should be last

* changelog

* seed, docs

* retrigger ci

* compare filenames

* move constants

* fix test

* epoch, global step

* improve test
2020-08-08 06:02:43 -04:00
Jirka Borovec f8c058215f
simplify tests & cleaning (#2588)
* simplify

* tmpdir

* revert

* clean

* accel

* types

* test

* edit test acc

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Update test acc

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-08-07 23:22:05 +02:00
William Falcon f82d7feb6c
updated hooks (#2850)
* modified hooks

* modified hooks

* modified hooks

* modified hooks

* modified hooks

* modified hooks

* modified hooks

* modified hooks

* modified hooks
2020-08-07 09:29:57 -04:00
ananthsub b39f4798a6
Add support to Tensorboard logger for OmegaConf hparams (#2846)
* Add support to Tensorboard logger for OmegaConf hparams

Address https://github.com/PyTorchLightning/pytorch-lightning/issues/2844

We check if we can import omegaconf, and if the hparams are omegaconf instances. if so, we use OmegaConf.merge to preserve the typing, such that saving hparams to yaml actually triggers the OmegaConf branch

* avalaible

* chlog

* test

Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
2020-08-07 09:13:21 -04:00
Rohit Gupta a642349228
Support limit_mode_batches (int) for infinite dataloader (#2840)
* Support limit_mode_batches(int) for infinite dataloader

* flake8

* revert and update

* add and update tests

* pep8

* chlog

* Update CHANGELOG.md

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Add suggestions by @awaelchli

* docs

* Apply suggestions from code review

Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>

* Apply suggestions from code review

* fix

* max

* check

* add and update tests

* max

* check

* check

* check

* chlog

* tests

* update exception message

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>
Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
2020-08-07 13:02:36 +02:00
Nima Sarang 793036d29c
Support returning python scalars in DP (#1935)
* Override the default gather method to support scalars

* add computing average of a list

* bug: change if to elif

* add some tests

* change style

* change documentation

* use apply_to_collection in DP gather

* use apply_to_collection in DP gather

* fix warning msg

* override gather method in DP

* add tests for python scalars

* add python scalars to docstring

* Update message

* override gather method in DP

* formatting

* chlog

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
2020-08-07 09:18:29 +02:00
Nicki Skafte 9a402461da
Bugfix: Lr finder and hparams compatibility (#2821)
* fix hparams lr finder bug

* add tests for new functions

* better tests

* fix codefactor

* fix styling

* fix tests

* fix codefactor

* Apply suggestions from code review

* modified hook

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2020-08-07 00:34:48 +02:00
Jirka Borovec ed3ee982b3
clean tests imports (#2834) 2020-08-06 16:58:51 +02:00
s-rog 9b997c8616
add test for none checkpoint in ddp_spawn (#2845)
* add test for none checkpoint in ddp_spawn

* fix code style

* make sure checkpoint_callback is none

* Fix tests

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
2020-08-06 07:11:43 -04:00
xmotli02 767c44950c
Added basic file logger (#2721)
* Added basic file logger #1803

* fixup! Added basic file logger #1803

* fixup! Added basic file logger #1803

* fixup! Added basic file logger #1803

* fixup! Added basic file logger #1803

* fixup! Added basic file logger #1803

* csv

* Apply suggestions from code review

* tests

* tests

* tests

* miss

* docs

Co-authored-by: xmotli02 <xmotli02@users.noreply.github.com>
Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-08-06 06:08:25 -04:00
Younghun Roh ac4a215071
Faster Accuracy metric (#2775)
* Faster classfication stats

* Faster accuracy metric

* minor change on cls metric

* Add out-of-bound class clamping

* Add more tests and minor fixes

* Resolve code style warning

* Update for #2781

* hotfix

* Update pytorch_lightning/metrics/functional/classification.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update about conversation

* Add docstring on stat_scores_multiple_classes

Co-authored-by: Younghun Roh <yhunroh@mindslab.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-08-06 11:40:35 +02:00
Justus Schock fe29c53ab5
add ddp sync for logging in result step (#2822)
* add ddp sync for logging in result step

* pep8

* pep8

* make ddp tests run also on cpu (except windowws)

* create class instance in ddp test

* revert automated formatting

* pep8
2020-08-05 20:42:09 -04:00
William Falcon b507c42c47
clarify batch hooks (#2842)
* modified hook

* modified hook

* modified hook

* modified hook

* modified hook

* modified hook

* modified hook

* modified hook

* modified hook

* modified hook

* modified hook

* modified hook

* modified hook
2020-08-05 20:01:30 -04:00
Ananya Harsh Jha a5f2b89ed0
updated sync bn (#2838)
* updated sync bn

* updated sync bn

* updated sync bn

* updated sync bn

* updated sync bn

* updated sync bn

* updated sync bn

* updated sync bn

* added ddp_spawn test

* updated test

* clean

* clean

Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
2020-08-06 01:12:11 +02:00
William Falcon 5d0f0325d8
Revert "Support limit_mode_batches (int) for infinite dataloader" (#2839)
* Revert "Support limit_mode_batches (int) for infinite dataloader (#2787)"

This reverts commit de9c9f0864.

* Update training_tricks.py
2020-08-05 15:57:26 -04:00
Jeff Yang 5bbcb8db1f
Improve SSIM (#2833)
* make ssim fast

* remove padding

* pep8

* add comments for readability

* plus -> coef
2020-08-05 13:40:11 -04:00
Rohit Gupta de9c9f0864
Support limit_mode_batches (int) for infinite dataloader (#2787)
* Support limit_mode_batches(int) for infinite dataloader

* flake8

* revert and update

* add and update tests

* pep8

* chlog

* Update CHANGELOG.md

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Add suggestions by @awaelchli

* docs

* Apply suggestions from code review

Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>

* Apply suggestions from code review

* fix

* max

* check

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>
Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
2020-08-05 17:04:49 +00:00
Nicki Skafte e3732789d7
Add remaning sklearn metrics (#2562)
* added balanced accuracy

* added dcg score

* added mean absolute error

* added mean squared error

* fix

* added mean squared log error

* add median absolute error and r2 score

* switch arguments

* added mean poisson deviance

* add mean gamma deviance and mean tweedie deviance

* fix styling

* added explained variance score

* added cohen kappa score

* added hamming, hinge, jaccard

* fix styling

* update sklearn requirement to newer version

* update requirement

* fix doctest

* fix tests

* added balanced accuracy

* added dcg score

* added mean absolute error

* added mean squared error

* fix

* added mean squared log error

* add median absolute error and r2 score

* switch arguments

* added mean poisson deviance

* add mean gamma deviance and mean tweedie deviance

* fix styling

* added explained variance score

* added cohen kappa score

* added hamming, hinge, jaccard

* fix styling

* update sklearn requirement to newer version

* fix doctest

* fix tests

* fix doctest

* fix failing docs

* fix test

* trying to fix errors

* Apply suggestions from code review

* format

Co-authored-by: Nicki Skafte <nugginea@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
2020-08-05 11:32:53 +02:00
Justus Schock ad0f1194aa
Support Mean in DDP Sync (#2568)
* Update converters.py

* Update test_converters.py

* pep8

* pep8 tests

* Update test_datamodules.py

* Update test_converters.py

* Update converters.py

* Update test_datamodules.py

* Update test_converters.py

* Update test_converters.py

* fix tests

* fix ddp tests on windows

* chlog

Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
2020-08-04 18:32:20 +02:00
Jirka Borovec 448be60701
update GPU to PT 1.5 (#2779)
* update gpu PT 1.6

* fix docker

* use PT 1.5

* Update tests/install_AMP.sh

Co-authored-by: Nathan Raw <nxr9266@g.rit.edu>

Co-authored-by: Nathan Raw <nxr9266@g.rit.edu>
2020-08-02 08:14:53 -04:00
Rohit Gupta 8baec1a191
Fix shuffle for distributed sampler (#2789)
* Fix shuffle for distributed sampler

* add test

* test

* chlog

* update test

* update test

* update test

* assertions via callback

* define callback outside for pickling

* skip ddp test on windows

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-08-01 23:22:57 -04:00
Nathan Raw 036bcea499
Call DataModule hooks implicitly in trainer (#2755)
*  call dm hooks in trainer implicitly

*  update tests

* 📝 remove unused stage arg from dm docs

*  update tests

*  update tests

* 🚧 include stage in datamodule.setup

* 📝 docs

* 📝 docs

* added more dm tests

* added more dm tests

* 🐛 call dm.setup everywhere

* 🔥 pickle tests now implied by accelerator tests

* 🎨 set dm as attr of trainer

* 🐛 .

* 🚧 wip

* add can prepare test

* add can prepare test

* verified setup in fit

* fixed setup call

* fixed setup call

* fixed setup call

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-08-01 20:17:57 -04:00
Jirka Borovec 3772601cd6
update CI testing with pip upgrade (#2380)
* try pt1.5

* cpu

* upgrade

* tpu

* user

* [blocked by #2380] freeze GPU PT 1.4 (#2780)

* freeze

* user
2020-07-31 14:50:06 -04:00
Jirka Borovec bc7a08fbe0
test dockers & add AMP in pt-1.6 (#1584)
* exist images

* names

* images

* args

* pt 1.6 dev

* circleci

* update

* refactor

* build

* fix

* MKL
2020-07-31 08:23:13 -04:00
Thomas Schaaf a6719f09f0
Bugfix/torchtext include lengths (#2689)
* Test using torchtext.data.Field with include_lengths=True/False

* Fix issue that Tensors in a Batch generated by torchtext with torchtext.data.Field configured as include_lengths=True

* Add description for fix of issue #2688

* changes to accomodate CodeFactor issues

* Another attemt to make last CodeFactor issue pass (it's a false alarm)

* temporarly disable test of test_grad_tracking to check if testing will pass

* reenable test in test_grad_norm

* Update CHANGELOG.md

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Renamed get_torchtext_data_iterator to _get_torchtext_data_iterator as suggested by @borda

* Update pytorch_lightning/utilities/apply_func.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* adding tests more specific to batch_move_data_to_device with tochtext Batch

* added check that Tensors were moved to target device

* removed tests using RNN models to be moved into a separate PR

* fixing FLAKE8 errors that showed up after merge from master branch
	modified:   tests/base/datamodules.py
	modified:   tests/callbacks/test_model_checkpoint.py

* parameterized test to reduce code duplication

* Added check only if length tensor exist. Removed left over comments.

* rearranged device parameterization and added pytest.param

* Try to figure out why only one device is tested on Linux machines

* Testing on CPU and GPU devices (GPU test is skip if no cuda device is available.

* added test for TPU device (experimental)

* Adding test parameterization for TPU test (experimental)

* change import statement to limit what is imported for a TPU environment

* made test work with TPU

* Change to trigger CI

* Change to trigger CI

* uncommented TPU test to check CI

* reenabling TPU test

* small change to trigger CI build

* small change to trigger CI build

* small change to trigger CI build

* adding tests/utilities/test_apply_func_torchtext.py to CI TPU test

* try to make test not skipped on CI with TPU

* remove testing on TPU

* undo an accidental change to test_tpu.py (file should not have been touched)

* small change to trigger CI build

* small change to trigger CI build

* Update tests/utilities/test_apply_func_torchtext.py

* Revert to previous version

* Apply suggestions from code review

* Change to trigger CI

Co-authored-by: Thomas Schaaf <tschaaf@mmm.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Thomas Schaaf <tschaaf@cs.cmu.edu>
2020-07-31 07:53:08 -04:00
Lezwon Castelino b7afac351b
Add onnx export (#2596)
* export model to onnx

* prepare data before exporting

* support for dataloaders and tensors

* added tests

* use example_input_array
add to changelog

* updated docstring

* added onnx inference tests

* temp commit

* removed schema valid test

* add onnxruntime to environment.yml

* moved onnxruntime to environment.yml pip

* add example in doc

* add lines between code block

* added PR to changelog

* is file check

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* remove *

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* infer example outputs

* added doctest for onnx

* fix windows tests

* moved eval within condition block

* self.forward to self

* added docs

* fixed docs error

* added to toctree

* Update CHANGELOG.md

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-07-31 12:27:57 +02:00
Jirka Borovec 06e8910f06
pytorch 1.6 (#2745)
* pt 1.6

* don't use the new zipfile serialization for now

* quick flake8 fixes

* remove unnecessary f

* coalesce strings

* remove comma

* remove extra commas

* Apply suggestions from code review

Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com>

* set _use_new_zipfile_serialization to False only for pytorch 1.6.0

* remove unnecessary comments

* flake8 fixes

* use pkg_resources instead of packaging

* readme

* format

* version

* chlog

Co-authored-by: Peter Yu <peter@asapp.com>
Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com>
2020-07-31 11:18:32 +02:00
Jirka Borovec 949734489a
remove deprecated in v0.9 (#2760)
* remove deprecated in v0.9

* data_loader

* import

* hook

* args
2020-07-30 23:19:28 +02:00
Phil 2f0fb34496
Speed up gradient clipping and allow parameters on multiple devices. (#2767)
The speed up is achieved by:
- Moving the "where" out of the loop (and replacing with min for simplicity).
- Replacing manual sum and pow with torch.norm. Even though this results
  in unnessecary computation (computing pow(root)) this is still a lot
  faster.
- Preallocating the output gives a slight speed up.

Note that calling .to for all parameters results in a small speed
penalty (~4 ms in my case) but allows parameters on different devices.

Overall this reduces the time used for gradient clipping from 206ms to
74 ms for my model (Resnet50 + few additional vars, all vars on GPU).
2020-07-30 11:53:24 -04:00
Ethan Harris 458d3e210e
Add missing methods to logger collection (#2723)
* Add missing methods to logger collection

* Update CHANGELOG.md

* Fix errors after merge

* Fix codefactor issues

* Update CHANGELOG.md

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-07-29 23:53:02 +02:00
Jirka Borovec 590e7fb1fd
tests: add default_root_dir=tmpdir (#2392)
* tests: add default_root_dir=tmpdir

* remove duplicate tmpdir args

* add missing fixture

* test requires multi gpu

* typo

* resize

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-07-28 09:47:53 -04:00
Jirka Borovec 0fe933e23d
fixing TPU tests (#2632)
* init

* rename

* tpu_core_idx

* idx 8

* idxs

* @pl_multi_process_test

* assert

* assert

* deamon

* no close

* imort

* msg

* use_single_gpu

* dataset

* idx

* fix idx

* dataset

* format

* add pickable

* typo

* apex

* typo

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* docs

* typo

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* docs

* docs

* Apply suggestions from code review

Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* Apply suggestions from code review

Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>

* docs

* Apply suggestions from code review

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-07-27 19:07:09 -04:00
Rohit Gupta 84c507c4df
Fix max_batches with fast_dev_run. (#2581)
* Fix fast_dev_run to run for all val_dataloaders

* fast_dev_run check

* changelog

* explicit

* limit_batches with fast_dev_run in init

* add test

* whitespace and comment fix

* comment and assertion

* added tests

* Fix fast_dev_run to run for all val_dataloaders

* fast_dev_run check

* changelog

* explicit

* limit_batches with fast_dev_run in init

* add test

* whitespace and comment fix

* comment and assertion

* added tests

* added tests

* added tests

* added tests

* update rtol

* Revert "update rtol"

This reverts commit 4320329540.

* added tests

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-07-27 17:56:55 -04:00
Adrian Wälchli d03953260d
Fix weights_save_path when logger is used + simplify path handling + better docs (#2681)
* fix weights_save path and drop ckpt_path

* add tests

* unused import

* update docs

* changelog

* pep8

* fix horovod test

* make backward compatible

* perform same test for all loggers

* fix for when logger=False and weights_save_path is set

* update changelog

* update docs

* update tests

* do not set save dir dynamically

* remove duplicate test

* remove duplicated tests

* update tests

* update tests

* remove remaining ckpt_path references

* move defaults to init as suggested by @Borda

* test deprecation
2020-07-27 12:53:11 -04:00
William Falcon 4dbd761a1c
refactor 3/n (#2709)
* reactor into gpu accelerator

* reactor into gpu accelerator

* reactor into gpu accelerator

* reactor into gpu accelerator

* reactor into gpu accelerator

* reactor into gpu accelerator

* reactor into gpu accelerator

* reactor into gpu accelerator

* reactor into gpu accelerator

* reactor into gpu accelerator

* reactor into gpu accelerator

* reactor into gpu accelerator

* reactor into gpu accelerator

* reactor into gpu accelerator

* reactor into gpu accelerator

* reactor into gpu accelerator

* reactor into gpu accelerator

* reactor into gpu accelerator

* reactor into gpu accelerator

* reactor into gpu accelerator

* reactor into gpu accelerator
2020-07-25 20:56:50 -04:00
Nathan Raw 9076551aec
Enable val/test loop disabling + datamodule tests (#2692)
* 🎨 warn instead of error out on loaders

* 🐛 test misconfiguration should still fail

* 🚧 .

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-07-25 12:57:40 -04:00
Rohit Gupta cb0c6ad51a
fix setup call while testing (#2624)
* fix setup call while testing

* changelog

* drop if condition

* add test to check setup call

* flake8

* update test to check model stage

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-07-24 23:57:31 -04:00
Nathan Raw 1caf8beb2c
Datamodule (#2668)
*  Add copy of pl_bolts datamodule to lightning

*  add datamodule to necessary init files

* 🚧 add datamodule property to LightningModule

* 🚧 .

* 🎨 Let DataModule do its own thing

* 🚧 add back setup and run both hooks implicitly

* 🚧 .

* 🐛 fix add_argparse_args

* 💄 apply black formatting and isort

* 📝 docstrings

* 📝 .

* 📝 .

* 🐛 overwrite cls prepare_data instead of instance

* 📝 .

*  add some tests

* Update datamodule.py

* Update datamodule.py

* Update datamodule.py

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-07-24 11:42:15 -04:00
Adrian Wälchli 938ec5a6c1
remove duplicate tests (#2685)
* remove duplicate test

* remove duplicated tests
2020-07-24 08:15:40 -04:00
Travis Addair 1369012bc7
Horovod: adjust base LR used by schedulers to scale with the number of workers (#2626)
* Horovod: Adjust base LR used by schedulers to match that of the optimizer after scaling by number of workers

* Added unit test

* Removed debug statements

* Updated changelog

* Apply suggestions from code review

Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-07-23 12:14:57 -04:00
Jeff Yang bda7cf1653
metrics: add SSIM (#2671)
* metrics: add SSIM

* Update CHANGELOG.md

fix codefactor issue

fix doctest

fix doctest

fix test

* added test for raise Error
2020-07-23 12:13:52 -04:00
Adrian Wälchli 1e68968ed7
support num_sanity_val_steps=-1 (#2246)
* support sanity_val_step=-1

* fix list size

* simplification

* simplify

* add test for num_sanity_val_steps=-1

* update test

* update docs

* extend tests to multiple dataloaders

* changelog

* Update tests/trainer/test_trainer.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* improve test

* refactor the sanity check decision

* fix merge

* Update trainer.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-07-23 07:07:03 -04:00
William Falcon 62ce00f96c
EvalResult support for val loop (PR 3/5) (#2651)
* add EvalResult to support to val/test loops
2020-07-22 13:53:10 -04:00
Jeff Yang 0a65826462
metrics: add BLEU (#2535)
* metrics: added bleu score and test bleu

* metrics: fixed type hints in bleu

* bleu score moved to metrics/functional/nlp.py

* refactor with torch.Tensor

* Update test_sequence.py

* refactor as Borda requests and nltk==3.2

* locked nltk==3.3

* nltk>=3.3, parametrized smooth argument for test

* fix bleu_score example

* added class BLEUScore metrics and test

* added class BLEUScore metrics and test

* update CHANGELOG

* refactor with torchtext

* torchtext changed to optional import

* fix E501 line too long

* add else: in optional import

* remove pragma: no-cover

* constants changed to CAPITALS

* remove class in tests

* List -> Sequence, conda -> pip, cast with tensor

* add torchtext in test.txt

* remove torchtext from test.txt

* bump torchtext to 0.5.0

* bump torchtext to 0.5.0

* Apply suggestions from code review

* ignore bleu score in doctest, renamed to nlp.py

* back to implementation with torch

* remove --ignore in CI test, proper reference format

* apply justus comment

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-07-22 09:58:24 -04:00
Adrian Wälchli a5538af355
fix dtype/device property not getting updated in submodules (#2657)
* recursive dtype device apply

* simplify

* simple test

* submodule test

* rename

* explicit

* type hints

* test for dp backend

* fix test skip

* rename

* add ddp_spawn test

* fix None index in test

* try fix ddp_spawn test

* changelog

* move _dtype and _device to mixin

* additional doctest
2020-07-21 15:18:57 -04:00
William Falcon 6d10ac2ac8
Structured results (train loop only. val loop separate PR) (PR 2/5) (#2615)
* r

* r

* r

* patched optimizer closure with sr

* patched optimizer closure with sr

* patched optimizer closure with sr

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added autoreduce for train step

* added auto reduce on train

* added auto reduce on train

* added auto reduce on train

* added auto reduce on train

* added auto reduce on train

* added auto reduce on train

* added hooks

* added hooks

* added hooks

* added hooks

* added hooks

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* cache

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* Update pytorch_lightning/callbacks/early_stopping.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/callbacks/early_stopping.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/callbacks/early_stopping.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/callbacks/model_checkpoint.py

* Update pytorch_lightning/core/step_result.py

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* simple

* finished tests for structured results on train epoch

* simple

* simple

* revert

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* Update tests/base/deterministic_model.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* finished tests for structured results on train epoch

* docstring typos

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* Update pytorch_lightning/core/step_result.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Update pytorch_lightning/overrides/data_parallel.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

Co-authored-by: Jirka <jirka@pytorchlightning.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
2020-07-20 19:00:20 -04:00
William Falcon aaa1553e35
tests for val loop flow (#2605)
* add tests for single scalar return from training

* add tests for single scalar return from training

* add tests for single scalar return from training

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only
2020-07-14 14:20:45 -04:00
William Falcon 1d565e175d
add tests for single scalar return from training (#2587)
* add tests for single scalar return from training

* add tests for single scalar return from training

* add tests for single scalar return from training

* add tests for single scalar return from training

* add tests for single scalar return from training
2020-07-11 17:43:00 -04:00
Jirka Borovec 458bbad550
Avoid zeros in dice and iou (#2567)
* nones

* fix

* fix

* test

* test

* test

* fix

* eps

* tpu

* eps

* type

* test tpu

* Update __init__.py

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-07-09 20:40:10 -04:00
William Falcon f35337adba
Fixes .test() for ddp (#2570)
* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint
2020-07-09 18:36:36 -04:00
William Falcon b73812648f
don't pass tpu weights back on test (#2566)
* enable none checkpoint

* enable none checkpoint
2020-07-09 12:11:56 -04:00
Rohit Gupta 6f4a488bae
Add functional regression metrics (#2492)
* Add functional regression metrics

* add functional tests

* add docs

* changelog

* init

* pep8

* docs

* docs

* setup docs

* docs

* Apply suggestions from code review

* Apply suggestions from code review

* typo

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka <jirka@pytorchlightning.ai>
2020-07-09 17:54:38 +02:00
William Falcon 4bbcfa04a3
.fit() returns last not best weights in ddp_spawn (#2565)
* added base tests for tpu

* added base tests for tpu

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint

* enable none checkpoint
2020-07-09 11:36:21 -04:00
Adrian Wälchli f16b4cfc52
save_dir fix for MLflowLogger + save_dir tests for others (#2502)
* mlflow rework

* logger save_dir

* folder

* mlflow

* simplify

* fix test

* add a test for file dir contents

* new line

* changelog

* docs

* Update CHANGELOG.md

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* test for comet logger

* improve mlflow checkpoint test

* prevent  commet logger error on pytest exit

* test tensorboard save dir structure

* wandb save dir test

* skip test on windows

* add mlflow to pickle tests

* wandb

* code factor

* remove unused imports

* remove unused setter

* wandb mock

* wip mock

* wip mock

* wandb tests with mocking

* clean up

* clean up

* comments

* include wandblogger in test

* clean up

* missing argument

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-07-09 07:15:41 -04:00
Hayden Housen 992a7e2a41
Start accumulate gradients schedule at epoch 0 (continued) (#2513)
* Start accumulate gradients schedule at epoch 0

* Undo change in #2375

* Update test_trainer.py::test_gradient_accumulation_scheduling

* Fix pep8 formatting

* Remove 'Datasets/' folder

* Split args for readability

* Fix pep8 formatting
2020-07-09 07:11:07 -04:00
Espen Haugsdal b3ebfec863
Fix argparse default value bug (#2526)
* Add failing test for bug

* Fix bug
2020-07-09 07:10:30 -04:00
William Falcon a95ef5a4ac
remove parameterize from TPU tests (#2561)
* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu
2020-07-09 06:46:07 -04:00
William Falcon 69cbb62774
Finish #2549 (#2557)
* removed spawns for test_converters and verified tests

Co-authored-by: Ananya Harsh Jha <ahj265@nyu.edu>
Co-authored-by: zcain <zcain@google.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka <jirka@pytorchlightning.ai>
2020-07-08 20:33:48 -04:00
Rohit Gupta d3f5717e81
Fix parameters and docs in metrics (#2473)
* Fix parameters and docs in metrics

* doc improvements

* whitespace

* doc indentation

* Apply suggestions from code review

Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>

* zero

* drop defaults

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
Co-authored-by: Jirka <jirka@pytorchlightning.ai>
2020-07-08 14:11:40 +02:00
Marijan Smetko 1dc724239a
PSNR metric (#2483)
* Add stub PSNR metric

* Fix linter

* Add data range as parameter

* Add tests

* Add scikit-image

* Add PSNR to regression metrics and add functional

* Refactor to functional

* Fix linter

* Fix linter, again

* Fix linter, again

* Fix typo in test

* Fix typo in another test

* Add scikit-image to conda

* Lift numpy requirement

* Add random tests

* Update CHANGELOG.md

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-07-08 10:26:11 +02:00
Anthony Bisulco 899cd74044
flatten Wandb hyperparameters dict (#2459)
* wandb logging fix

* Changelog fix

* change test
2020-07-08 07:45:25 +02:00
Adrian Wälchli 78db847e42
Fixed skipped horovod tests (#2514)
* skip ckpt test on rank  > 0

* fx test

* add extra assert

* code factor

* add back removed

* add old loading code

* add back old

* unused import

* add same skip to run_model_without_loggers

* test if horovod now works with python 3.8

* test remove all 3.8 skips

* remove spawn

* fix

* fix test

* move load check up

* fix test multigpu

* rename

* fix gpu mode

* on gpu fix when on cpu

* move
2020-07-07 14:54:07 -04:00
William Falcon 11069c8784
Fix ddp tests + .test() (#2512)
* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* fix deprecation warnings

* added base tests for tpu

* added base tests for tpu

* Update pytorch_lightning/trainer/trainer.py

Co-authored-by: Jeremy Jordan <13970565+jeremyjordan@users.noreply.github.com>

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

* added base tests for tpu

Co-authored-by: Jirka <jirka@pytorchlightning.ai>
Co-authored-by: Jeremy Jordan <13970565+jeremyjordan@users.noreply.github.com>
2020-07-07 12:24:56 -04:00
Jirka Borovec 977df6ed31
Docker: building XLA base image (#2494)
* refactor

* add TPU base

* wip

* builds

* typo

* extras

* simple

* unzip

* rename
2020-07-06 14:21:36 -04:00
Jeremy Jordan a91b06ed1e
fix worker warning (#2504)
* fix worker warning

* improve tests

* suggestion

Co-authored-by: Jirka <jirka@pytorchlightning.ai>
2020-07-06 15:45:43 +02:00
vr140 96b32bee04
[tiny] Fix training_dataloader usage to be train_dataloader instead. (#2521)
Co-authored-by: Vijay Rajaram <vrajaram3@gatech.edu>
2020-07-06 10:44:44 +02:00
Adrian Wälchli 1098a0d725
make loggers pickleable (#2518)
* state updates to logger

* change log

* changelog
2020-07-05 19:57:22 -04:00
Adrian Wälchli 6bfcfa8671
fix dtype conversion of example_input_array in model summary (#2510)
* fix dtype conversion

* changelog
2020-07-05 07:17:22 -04:00
William Falcon 9924c76faa
Amp2 (#2505)
* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang

* fix tpu hang
2020-07-04 22:52:49 -04:00
William Falcon 020c332ae9
Clean up (#2467)
* Fixes #2455

* Fixes #2455

* Fixes #2455

* Fixes #2455

* Fixes #2455

* Fixes #2455

* Fixes #2455

* Fixes #2455

* Fixes #2455

* Fixes #2455

* Fixes #2455

* Fixes #2455

* Fixes #2455

* Fixes #2455

* Fixes #2455

* Fixes #2455

* Fixes #2455

* Fixes #2455

* Fixes #2455

* Fixes #2455

* Fixes #2455

* Fixes #2455

* Fixes #2455

* Fixes #2455

* Fixes #2455

* Fixes #2455

* Fixes #2455

* added early stop tpu test

* added early stop tpu test

* added early stop tpu test

* added early stop tpu test

* added early stop tpu test

* added early stop tpu test

* added early stop tpu test

* added early stop tpu test

* added early stop tpu test

* added early stop tpu test

* added early stop tpu test

* added early stop tpu test

* added early stop tpu test

* added early stop tpu test

* added early stop tpu test

* added early stop tpu test

* added early stop tpu test

* added early stop tpu test

* added early stop tpu test

* added early stop tpu test

* added early stop tpu test

* added early stop tpu test
2020-07-03 00:38:29 -04:00
Adrian Wälchli 927f305f7e
Warn user when IterableDataset has __len__ defined (#2437)
* add warning when getting checking len

* added test

* changelog

* pep

* do not show warning below 1.4

* try version parse

* comments

* xfail

* Update requirements/base.txt

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/trainer/data_loading.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* version

Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka <jirka@pytorchlightning.ai>
2020-07-01 07:53:19 -04:00
Adrian Wälchli 145670f893
fix logging on rank 0 only (#2425)
* fix and test for ddp block logging rank > 0

* rename

* use the dummy logger

* dummy logger test

* set the logger in  model

* decorator for rank zero experiment

* simplify check

* simplify

* fix problem with None in checkpoint path

* revert configure logger

* unused import

* offline

* try rank 0 decorator in checkpoint

* try fix test

* imgs

* add asserts to make sure log zero only saves checkpoints

* add asserts to make sure log zero only saves checkpoints

* add asserts to make sure log zero only saves checkpoints

* add asserts to make sure log zero only saves checkpoints

* add asserts to make sure log zero only saves checkpoints

* fix tpu tests

* fix tpu tests

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-06-30 18:09:16 -04:00
William Falcon a42a0e16dd
Fixes train outputs (#2428)
* fix outputs

* fix outputs
2020-06-30 10:03:49 -04:00
Adrian Wälchli 25ee51bc57
Continue Jeremy's early stopping PR #1504 (#2391)
* add state_dict for early stopping

* move best attr after monitor_op defined

* improve early stopping and model checkpoint callbacks

* fix formatting

* fix attr init order

* clean up setting of default_root_dir attr

* logger needs default root dir set first

* reorg trainer init

* remove direct references to checkpoint callback

* more fixes

* more bugfixes

* run callbacks at epoch end

* update tests to use on epoch end

* PR cleanup

* address failing tests

* refactor for homogeneity

* fix merge conflict

* separate tests

* tests for early stopping bug regressions

* small fixes

* revert model checkpoint change

* typo fix

* fix tests

* update train loop

* cannot pass an int as default_save_path

* refactor log message

* fix test case

* appease the linter

* fix some doctests

* move config to callback

* fixes from rebase

* fixes from rebase

* chlog

* docs

* reformat

* formatting

* fix

* fix

* fixes from rebase

* add new test for patience

* Update pytorch_lightning/callbacks/model_checkpoint.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/callbacks/model_checkpoint.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/callbacks/test_early_stopping.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* fix formatting

* remove enable_early_stop attribute

* add state_dict for early stopping

* move best attr after monitor_op defined

* improve early stopping and model checkpoint callbacks

* fix formatting

* fix attr init order

* clean up setting of default_root_dir attr

* logger needs default root dir set first

* reorg trainer init

* remove direct references to checkpoint callback

* more fixes

* more bugfixes

* run callbacks at epoch end

* update tests to use on epoch end

* PR cleanup

* address failing tests

* refactor for homogeneity

* fix merge conflict

* separate tests

* tests for early stopping bug regressions

* small fixes

* revert model checkpoint change

* typo fix

* fix tests

* update train loop

* fix test case

* appease the linter

* fix some doctests

* move config to callback

* fixes from rebase

* fixes from rebase

* chlog

* docs

* reformat

* formatting

* fix

* fix

* fixes from rebase

* add new test for patience

* Update pytorch_lightning/callbacks/model_checkpoint.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/callbacks/model_checkpoint.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/callbacks/test_early_stopping.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* fix formatting

* remove enable_early_stop attribute

* fix test with new epoch indexing

* fix progress bar totals

* fix off by one error (see #2289) epoch starts at 0 now

* added missing imports

* fix hpc_save folderpath

* fix formatting

* fix tests

* small fixes from a rebase

* fix

* tmpdir

* tmpdir

* tmpdir

* wandb

* fix merge conflict

* add back evaluation after training

* test_resume_early_stopping_from_checkpoint TODO

* undo the horovod check

* update changelog

* remove a duplicate test from merge error

* try fix dp_resume test

* add the logger fix from master

* try remove default_root_dir

* try mocking numpy

* try import numpy in docs test

* fix wandb test

* pep 8 fix

* skip if no amp

* dont mock when doctesting

* install extra

* fix the resume ES test

* undo conf.py changes

* revert remove comet pickle from test

* Update CHANGELOG.md

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update weights_loading.rst

* Update weights_loading.rst

* Update weights_loading.rst

* renamed flag

* renamed flag

* revert the None check in logger experiment name/version

* add the old comments

* _experiment

* test chckpointing on DDP

* skip the ddp test on windows

* cloudpickle

* renamed flag

* renamed flag

* parentheses for clarity

* apply suggestion max epochs

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

Co-authored-by: Jeremy Jordan <jtjordan@ncsu.edu>
Co-authored-by: Jirka <jirka@pytorchlightning.ai>
Co-authored-by: Jeremy Jordan <13970565+jeremyjordan@users.noreply.github.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-06-28 21:36:46 -04:00
Jirka Borovec 1e16681693
fix loading with hparams (#2403)
* fix #2386

* extra test

* extra case

* extra test

* chlog

* fix test
2020-06-28 20:22:03 -04:00
Jirka Borovec 861a73be12
fix loading past checpoints (#2405)
* fix #2334

* chlog
2020-06-28 17:20:33 -04:00
Jirka Borovec 51711c265a
fix loading model with kwargs (#2387)
* test

* fix

* fix
2020-06-27 16:38:03 -04:00
Mateusz Pieniak e82d9cdb66
Support torchtext on a single GPU (#2379)
* Handle torchtext.data.Batch on GPU

* Update CHANGELOG.md

* Apply code review requests

* Correct the docs

* Change requirements
2020-06-27 16:36:45 -04:00
Jirka Borovec 41f5df18a4
move Trains logger to Bolts (#2384)
* move Trains logger

* chlog
2020-06-27 09:14:05 -04:00
Jirka Borovec f1c96930b1
repair CI for Win (#2358)
* no cov

* no cov

* ReduceOp

* group

* reduce_op.sum

* Update sklearns.py

* formatting

* horovod

* Apply suggestions from code review

* horovod

* horovod

* horovod

* horovod

* ci

* print

* ci

* timeout

* timeout

* time

* fix

* distributed cpu

* pipes

* time

* cpu

* spawn

* spawn

* spawn

* tp

* separate

* os

* os

* npm

* Fix load_from_checkpoint() not working with URL on Windows

* Update CHANGELOG

* Update CHANGELOG.md

Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com>

* Apply suggestions from code review

* fix

* fix meta tags creating empty lines

* pyright

* node

* fix httpserver address

* drop tutils.default_trainer_options

* imports

* Better fix for load_from_checkpoint() not working with absolute path on Windows (#2294)

* Fix load_from_checkpoint() not working with URL on Windows

* Update CHANGELOG

* Update CHANGELOG.md

Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com>

* drop duplicate

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: airium <airium@outlook.com>
Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: AIRIUM <38249940+airium@users.noreply.github.com>
2020-06-26 21:38:25 -04:00
Jirka Borovec a5f45787ea
fix get dataloader size (#2375)
* get dataloader size

* pyright
2020-06-26 15:38:48 -04:00
Thomas Schaaf 7c0a3f4745
Bugfix/_has_len (#2307)
* deal with NotImplementedError raised by torchtext

* deal with NotImplementedError raised by torchtext

* Added tests for dataloader which raise NotImplementedError in __len__()

* Fixed some typos

* enabled tests for dataloader raising NotImplementedError in __len__ and corrected match string for raised exception

* deleted empty line for style compliance

* refactored CustomNotImplementedErrorDataloader to derive from CustomInfDataloader

* enabled reduced number of not_implemented_error dataloader test to reduce runtime for continuous integration

* reduced test number of not_implemented_error dataloader test further to reduce test time

* reduced test number of not_implemented_error dataloader test to one to reduce test time

* disabled all not_implemented_error dataloader test to see if test pass in time

* added __next__ with a reduced number (5) of elements after which CustomNotImplementedErrorDataloader stops to speedup test.

* enabling all not_implemented_error dataloader test

* added brief description of change and relation of torchtext

* CustomNotImplementedErrorDataloader reduced number of batches served to 2.

* Update CHANGELOG.md

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Apply suggestions from code review

* Update CHANGELOG.md

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Disable parallelism in dataloader

Suspect that it might cause pytest to hang more frequent

* added max_steps=None to Trainer in not_implemented_error dataloader tests

* rearranged not_implemented_error test in file to group them together

* disabled parallel data loading
Reason: testing if that stops the test framework from hanging.

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Thomas Schaaf <tschaaf@cs.cmu.edu>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-06-26 09:31:08 -04:00
William Falcon f2710bb500
adds tensorboard hparams logging test (#2342)
* fixes hparam logging

* fixes hparam logging

* fixes hparam logging

* fixes hparam logging

* fixes hparam logging

* Apply suggestions from code review

* skipif

* rename

* Update test_tensorboard.py

* Update test_tensorboard.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka <jirka@pytorchlightning.ai>
2020-06-25 09:22:28 -04:00
Adrian Wälchli aab9e77d2d
Fix lost compatibility with custom datatypes implementing `.to` (#2335)
* generalize data transfer

* added test

* update docs

* fix spelling error

* changelog

* update docs
2020-06-23 23:41:02 -04:00
William Falcon 598f5140c5
refactor training loop (#2336)
* refactoring training epoch

* refactored training epoch

* refactored training epoch

* refactored training epoch

* refactored training epoch

* refactored training epoch

* fixes slurm weights saving

* fixes slurm weights saving
2020-06-23 23:38:22 -04:00
Lezwon Castelino 9446390779
fix TPU parsing and TPU tests (#2094)
* added tpu params test

* added tests

* removed xla imports

* added test cases for TPU

* fix pep 8 issues

* refactorings and comments

* add message to MisconfigurationException

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* test if device is set correctly

* added TPU device check
removed mark.spawn

* removed device selection

* remove xla_device call

* readded spawn due to test failures

* add TODO for tpu check

* Apply suggestions from code review

* Apply suggestions from code review

* flake8

* added tpu args to cli tests

* added support for tpu_core selection via cli

* fixed flake formatting

* replaced default_save_path with default_root_dir

* added check for data type for tpu_cores

* fixed flake indent

* protected

* protected

* added tpu params test

* added tests

* removed xla imports

* test if device is set correctly

* added support for tpu_core selection via cli

* replaced default_save_path with default_root_dir

* added check for data type for tpu_cores

* chlog

* fixed tpu cores error

* rebased with latest changes

* flake fix

* Update pytorch_lightning/trainer/distrib_parts.py

added suggesstion

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka <jirka@pytorchlightning.ai>
2020-06-23 12:06:57 -04:00
Adrian Wälchli e085e93dd3
Add missing test for "multiple dataloader + percent_check fix" (#2226)
* Init fix num_batches

* Fix num_batches in case of multiple dataloaders

* Apply suggestions from code review

* Changes based on suggestions

* Flake8

* Add test to check num_batches

* generalize dataloader percent check test

* fix formatting

* remove hparams

* tests

* CHANGELOG

* Update CHANGELOG.md

* max_batches can be int

* conflict and rebase

* add back the test


fix


fix message


0.0 works


Revert "fix message"

This reverts commit 839cacf8b8610f4e697e654ef6f3d2501bf23984.

* update changelog

* Update CHANGELOG.md

* Fix num batches in case of multiple dataloaders and percent_check (#1920)

* git conflict

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* missing union

* doc update suggestion by @rohitgr7

* extend test

* changelog

* docs add note about multiple loaders

* update changelog

* remove unused variable

Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-06-23 11:21:24 -04:00
William Falcon 0f073819d3
refactored training_batch + tests to verify correctness (#2328)
* refactored training_bath

* refactored training_bath

* refactored training_bath

* refactored training_bath

* refactored training_bath

* refactored training_bath

* refactored training_bath

* refactored training_bath

* refactored training_bath

* refactored training_bath

* refactored training_bath
2020-06-23 11:17:10 -04:00
Tri Dao 29179dbfcc
Fix ROC metric for CUDA tensors (#2304)
* Fix ROC metric for CUDA tensors

Previously roc metric (and auroc) errors when passed in CUDA tensors,
due to torch.tensor construction without specifying device.
This fixes the error by using F.pad instead.

* Update test_classification.py

* Update test_classification.py

* chlog

* Update test_classification.py

* Update test_classification.py

* Update tests/metrics/functional/test_classification.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update test_classification.py

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: Jirka <jirka@pytorchlightning.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-06-23 15:19:16 +02:00
elias-ramzi 92f122e0df
Fix average_precision metric (#2319)
* Fixed average_precision metric, parenthesis were missing. Added test test that failed with the old implementation

* Modified CHANGELOG.md

* Update CHANGELOG.md

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-06-23 13:21:00 +02:00
Adrian Wälchli f972ab3a82
Fix summary hook handles not getting removed (#2298)
* detach hooks after completion

* detach hook

* update docs

* add test

* docs

* changelog
2020-06-20 07:38:47 -04:00
Jirka Borovec 4b90b79080
check omegaconf gpus (#2273)
* check omegaconf gpus

* test

* test

* Apply suggestions from code review

Co-authored-by: Omry Yadan <omry@fb.com>

Co-authored-by: Omry Yadan <omry@fb.com>
2020-06-19 23:42:11 -04:00
Jirka Borovec 7ecb0d2528
test CLI parsing gpus (#2284)
* cli gpus

* test

* test
2020-06-19 23:41:42 -04:00
Jirka Borovec f278ac42c8
Revert/Fix: epoch indexing from 1, to be from 0 (#2289)
* Revert "deprecated: epoch indexing from 1 (#2206)"

This reverts commit f94b919b

* chlog

* grad index

* Apply suggestions from code review

* tests

* fix

* test
2020-06-19 23:39:53 -04:00
thschaaf 554fb4754c
Bugfix/_has_len (#2293)
* deal with NotImplementedError raised by torchtext

* deal with NotImplementedError raised by torchtext

* Added tests for dataloader which raise NotImplementedError in __len__()

* Fixed some typos

Co-authored-by: Thomas Schaaf <tschaaf@cs.cmu.edu>
2020-06-19 23:38:15 -04:00
Jirka Borovec e0b7fed92e
deprecated Trainer proc_rank (#2269)
* deprecated

* test
2020-06-19 15:46:27 -04:00
Sam Shleifer e780072961
Attempt to add broken test to mimic transformers use case (#2272)
* Attempt to add broken test

* use wandb logger

* Update test_amp.py

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-06-19 14:43:07 -04:00
William Falcon 03ab574b0f
decrease some training times (#2256) 2020-06-18 23:30:16 -04:00
William Falcon 6ae9a97b09
remove frame inspection on self.hparams (#2253)
* remove frame inspection on self.hparams

* remove frame inspection on self.hparams

* remove frame inspection on self.hparams

* remove frame inspection on self.hparams

* remove frame inspection on self.hparams

* remove frame inspection on self.hparams
2020-06-18 23:08:25 -04:00
Vincent Thibault 4903f9ebd4
Fixed the load_from_checkpoint path detected as URL bug (#2244)
* Fixed the load_from_checkpoint path detected as URL bug

* Fixed the load_from_checkpoint path detected as URL bug

* fixed Caps lock typo

* Added .absolute() to checkpoint path to force hard drive prefix in string
2020-06-18 17:53:51 -04:00
j-dsouza e0b7359555
[metrics] IoU Metric (#2062)
* add iou function

* update stat scores

* add iou class

* add iou tests

* chlog

* Apply suggestions from code review

* tests

* docs

* Apply suggestions from code review

* docs

Co-authored-by: Jirka <jirka@pytorchlightning.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-06-18 09:06:31 -04:00
William Falcon 79e1426161
Docs clean-up (#2234)
* update docs

* update docs

* update docs

* update docs

* update docs

* update docs
2020-06-18 08:29:18 -04:00
William Falcon 34816e9ec4
adds setup+teardown hook (#2229)
* allow regression metrics to import

* allow regression metrics to import

* allow regression metrics to import

* allow regression metrics to import

* allow regression metrics to import

* allow regression metrics to import

* allow regression metrics to import

* allow regression metrics to import

* allow regression metrics to import
2020-06-17 19:49:58 -04:00
Xavier Sumba ead874b17d
Regression metrics (#2221)
* add regression metrics

* solve tests

* add docs
2020-06-17 13:44:06 -04:00
William Falcon 2411c3be70
replace train_percent_check with limit_train_batches (#2220)
* drop train_percent_check

* drop train_percent_check

* drop train_percent_check

* drop train_percent_check

* drop train_percent_check

* drop train_percent_check

* drop train_percent_check

* drop train_percent_check

* drop train_percent_check

* drop train_percent_check

* drop train_percent_check

* drop train_percent_check

* drop train_percent_check

* drop train_percent_check

* drop train_percent_check

* drop train_percent_check

* drop train_percent_check

* chlog

* deprecated

* deprecated

* deprecated

* tests

* tests

* Apply suggestions from code review

* tests

* hydra support

* tests

* hydra support

* hydra support

* hydra support

* tests

* typo

* typo

* Update test_dataloaders.py

* docs

* docs

* docs

* docs

Co-authored-by: Jirka <jirka@pytorchlightning.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-06-17 13:42:28 -04:00
William Falcon 04c794ca72
[WIP] Rename overfit_pct to overfit_batches (and fix) and val_percent_check and test_percent_check (and fix) (#2213)
* fixed percent check for val/test

* fixed percent check for val/test

* fixed percent check for val/test

* fixed percent check for val/test

* overfit_pct now uses train loaders for val and test and does not shuffle

* overfit_pct now uses train loaders for val and test and does not shuffle

* overfit_pct now uses train loaders for val and test and does not shuffle

* overfit_pct now uses train loaders for val and test and does not shuffle

* overfit_pct now uses train loaders for val and test and does not shuffle

* overfit_pct now uses train loaders for val and test and does not shuffle

* overfit_pct now uses train loaders for val and test and does not shuffle

* overfit_pct now uses train loaders for val and test and does not shuffle

* overfit_pct now uses train loaders for val and test and does not shuffle

* overfit_pct now uses train loaders for val and test and does not shuffle

* overfit_pct now uses train loaders for val and test and does not shuffle

* overfit_pct now uses train loaders for val and test and does not shuffle

* overfit_pct now uses train loaders for val and test and does not shuffle

* overfit_pct now uses train loaders for val and test and does not shuffle

* overfit_pct now uses train loaders for val and test and does not shuffle

* overfit_pct now uses train loaders for val and test and does not shuffle

* overfit_pct now uses train loaders for val and test and does not shuffle

* overfit_pct now uses train loaders for val and test and does not shuffle

* overfit_pct now uses train loaders for val and test and does not shuffle

* overfit_pct now uses train loaders for val and test and does not shuffle

* overfit_pct now uses train loaders for val and test and does not shuffle

* overfit_pct now uses train loaders for val and test and does not shuffle

* overfit_pct now uses train loaders for val and test and does not shuffle

* add on fit_start on fit_end hooks

* add on fit_start on fit_end hooks

* add on fit_start on fit_end hooks

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-06-17 08:03:28 -04:00
William Falcon e1f238a097
add on fit_start on fit_end hooks (#2217)
* add on fit_start on fit_end hooks

* add on fit_start on fit_end hooks

* add on fit_start on fit_end hooks
2020-06-17 07:37:16 -04:00
Nicki Skafte f1c732a77b
Metric docs fix (#2209)
* fix docs

* Update docs/source/metrics.rst

* Update docs/source/metrics.rst

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update docs/source/metrics.rst

* Update docs/source/metrics.rst

* Update metrics.rst

* title

* fix

* fix for num_classes

* chlog

* nb classes

* hints

* zero division

* add tests

* Update metrics.rst

* Update classification.py

* Update classification.py

* prune doctests

* docs

* Apply suggestions from code review

* Apply suggestions from code review

* flake8

* doctests

* formatting

* cleaning

* formatting

* formatting

* doctests

* flake8

* docs

* rename

* rename

* typo

Co-authored-by: Nicki Skafte <nugginea@gmail.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka <jirka@pytorchlightning.ai>
Co-authored-by: edenlightning <66261195+edenlightning@users.noreply.github.com>
2020-06-17 07:34:39 -04:00
William Falcon 55fbcc00f6
Metrics docs (#2184)
* fixes

* fixes

* fixes

* fixes

* fixes

* fixes

* fixes

* fixes

* fixes

* fixes

* fixes

* fixes

* fixes

* fixes

* fixes

* fixes

* fixes

* fixes

* fixes

* fixes

* Apply suggestions from code review

Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>

* add workers fix

* add workers fix

* add workers fix

* add workers fix

* add workers fix

* add workers fix

* add workers fix

* add workers fix

* add workers fix

* add workers fix

* Update docs/source/metrics.rst

Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>

* Update docs/source/metrics.rst

Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>

* Update docs/source/metrics.rst

Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>

* Update docs/source/metrics.rst

Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>

* add workers fix

* add workers fix

* add workers fix

* doctests

* add workers fix

* add workers fix

* fixes

* fix docs

* fixes

* fixes

* fixes

* fixes

* fixes

* fixes

* fixes

* fixes

* fixes

* fixes

* fixes

* fixes

* fixes

* fixes

* fixes

* fixes

* fixes

* Apply suggestions from code review

Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>

* add workers fix

* Update docs/source/metrics.rst

Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>

* doctests

* add workers fix

* fix docs

* fixes

* fixes

* fix doctests

* Apply suggestions from code review

* fix doctests

* fix examples

* bug

* Update docs/source/metrics.rst

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update docs/source/metrics.rst

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update docs/source/metrics.rst

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* fixes

* fixes

* fixes

* fixes

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
Co-authored-by: Jirka <jirka@pytorchlightning.ai>
Co-authored-by: Nicki Skafte <nugginea@gmail.com>
2020-06-16 07:42:56 -04:00
Jirka Borovec e289e45120
test: save hparams to yaml (#2198)
* save hparams to yaml

* import

* resolves

* req

* Update requirements/base.txt

Co-authored-by: Omry Yadan <omry@fb.com>

Co-authored-by: Omry Yadan <omry@fb.com>
2020-06-16 06:34:55 -04:00
Jirka Borovec f94b919b96
deprecated: epoch indexing from 1 (#2206)
* epoch indexing from 1

* chlog

* fix tests

* fix tests

* self.min_epochs
2020-06-16 06:33:41 -04:00
Jirka Borovec 8870a84aa8
reduce test warnings (#2202)
* reduce test warnings

* Update test_trainer.py

* Update test_trainer.py

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-06-15 23:06:17 -04:00
Jirka Borovec db7bb4c348
cleaning tests (#2201) 2020-06-15 22:03:40 -04:00
Adrian Wälchli 7dc58bd286
Refactor model summary + generalize example input array (#1773)
* squash

variant a


variant b


add test


revert rename


add changelog


docs


move changelog entry to top


use hooks


wip


wipp


layer summary


clean up, refactor


type hints


rename


remove obsolete code


rename


unused imports


simplify formatting of table and increase readability


doctest


superclass object


update examples


print unknown sizes


more docs and doctest


testing


unknown layers


add rnn test


remove main


restore train mode


test device wip


device


constant


simplify model forward transfer


return summary object in method


extend tests


fix summary for empty module


extend tests


refactor and added hook


variant a


variant b


add test


revert rename


add changelog


docs


move changelog entry to top


remove hardcoded string


simplify


test unknown shapes and all others


comments for tests


fix hparams attribute

* update default

* unused import

* clean up

* replace hardcoded strings

* fix doctest

* fix top/full

* black

* fix rnn test

* fix rnn

* update debugging docs


update docs


typo


update docs


update docs

* add changelog

* extract constant

* setter and getter

* move parity models to test folder

* parameterize mode
2020-06-15 17:05:58 -04:00
Adrian Wälchli 22d9464e56
HenryJia: auto-move data decorator (#1905)
* First attempt at auto-moving data for inference

* Correct my copypaste errors

* Correct for if device is CPU

* Get rid of the WIP code I accidentally added

* Add tests

* Make tests more foolproof

* Make sure we stick with pep8 formatting

* Clarify docs a little

* Apply suggestions from code review

* Get everything working again hopefully

* refactor and added hook


variant a


variant b


add test


revert rename


add changelog


docs

* move changelog entry to top

* Move data transfer to utilities

* Add back in warnings for autotransfer

* Get rid of the test code I ended up accidentally commiting again

* Add docs any changelog

* Correct PR number in Changelog

* Correct changelog

* Update data.py

* Update test_cpu.py

* make a decorator

* type hint

* changelog

* changelog

* remove old function

* import

* test for decorator

* fix test

* remove old test

* doctest

* apply decorator directly

* convert doctest to code block

* prevent side effects in tests

* fix merge

* update forward docs

* update docs

* added docs in section "deployment / prediction"

* update changelog

Co-authored-by: Hengjian Jia <henryjia18@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-06-15 17:04:32 -04:00
Peter Yu 37e7582486
Add ckpt_path option to LightningModule.test() (#2190)
* Add ckpt_path option to LightningModule.test()

If ckpt_path is "best" (default), it loads the best weights saved by ModelCheckpoint for the test loop.
If ckpt_path is a path to a checkpoint file, it loads the weights from the file for the test loop.
If ckpt_path is None, it uses the weights from the end of training for the test loop.
If model parameter is set, ckpt_path is ignored.

* Update test_set.rst

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-06-15 08:02:37 -04:00
Simon-Martin Schröder fd1693e289
Handle KeyboardInterrupt during training (#2134)
* Handle KeyboardInterrupt during training

Fixes #2079.

* chlog

* Fix whitespace

* Update callback_hook.py

* Update base.py

* Update training_loop.py

* Update test_trainer.py

* Update CHANGELOG.md

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Update CHANGELOG.md

* on_keyboard_interrupt

Co-authored-by: Jirka <jirka@pytorchlightning.ai>
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-06-15 12:35:26 +02:00
Nicki Skafte 02262d0a93
Fix for accuracy calculation (#2183)
* accuracy_fix

* fix line length

* Apply suggestions from code review

* Update test_classification.py

Co-authored-by: Nicki Skafte <nugginea@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-06-14 18:14:29 -04:00
Jirka Borovec c0903b800d
past checkpoints (#2160)
* past checkpoints

* omegaConf save

* enforce type

* resolve=True

Co-authored-by: Omry Yadan <omry@fb.com>

* test omegaconf

* tests

* test past

Co-authored-by: Omry Yadan <omry@fb.com>
2020-06-14 11:36:45 -04:00
Jirka Borovec bfaabd7b7f
clean requirements (#2128)
* clean requirements

* missing

* missing

* req

* min

* default >> base

* base.txt
2020-06-13 10:15:22 -04:00
Justus Schock 3436d00230
Native torch metrics (#1488)
* Create metric.py

* Create utils.py

* Create __init__.py

* Create __init__.py

* Create __init__.py

* add tests for metric utils

* add tests for metric utils

* add docstrings for metrics utils

* add docstrings for metrics utils

* add function to recursively apply other function to collection

* add function to recursively apply other function to collection

* add tests for this function

* add tests for this function

* add tests for this function

* update test

* update test

* Update pytorch_lightning/metrics/metric.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* update metric name

* remove example docs

* fix tests

* fix tests

* add metric tests

* fix to tensor conversion

* fix to tensor conversion

* fix apply to collection

* fix apply to collection

* Update pytorch_lightning/metrics/metric.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* remove tests from init

* remove tests from init

* add missing type annotations

* rename utils to convertors

* rename utils to convertors

* rename utils to convertors

* rename utils to convertors

* Update pytorch_lightning/metrics/convertors.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/metrics/convertors.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/metrics/convertors.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/metrics/convertors.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/metrics/convertors.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/metrics/convertors.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/metrics/metric.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/utilities/test_apply_to_collection.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/utilities/test_apply_to_collection.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/utilities/test_apply_to_collection.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/utilities/test_apply_to_collection.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/metrics/convertors.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/metrics/convertors.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Apply suggestions from code review

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Apply suggestions from code review

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Apply suggestions from code review

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* add doctest example

* rename file and fix imports

* rename file and fix imports

* added parametrized test

* added parametrized test

* replace lambda with inlined function

* rename apply_to_collection to apply_func

* rename apply_to_collection to apply_func

* rename apply_to_collection to apply_func

* Separated class description from init args

* Apply suggestions from code review

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* adjust random values

* suppress output when seeding

* remove gpu from doctest

* Add requested changes and add ellipsis for doctest

* Add requested changes and add ellipsis for doctest

* Add requested changes and add ellipsis for doctest

* forgot to push these files...

* forgot to push these files...

* forgot to push these files...

* add explicit check for dtype to convert to

* add explicit check for dtype to convert to

* fix ddp tests

* fix ddp tests

* fix ddp tests

* remove explicit ddp destruction

* remove explicit ddp destruction

* New metric classes (#1326)

* Create metrics package

* Create metric.py

* Create utils.py

* Create __init__.py

* add tests for metric utils

* add docstrings for metrics utils

* add function to recursively apply other function to collection

* add tests for this function

* update test

* Update pytorch_lightning/metrics/metric.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* update metric name

* remove example docs

* fix tests

* add metric tests

* fix to tensor conversion

* fix apply to collection

* Update CHANGELOG.md

* Update pytorch_lightning/metrics/metric.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* remove tests from init

* add missing type annotations

* rename utils to convertors

* Create metrics.rst

* Update index.rst

* Update index.rst

* Update pytorch_lightning/metrics/convertors.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/metrics/convertors.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/metrics/convertors.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/metrics/metric.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/utilities/test_apply_to_collection.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/utilities/test_apply_to_collection.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/metrics/convertors.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Apply suggestions from code review

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* add doctest example

* rename file and fix imports

* added parametrized test

* replace lambda with inlined function

* rename apply_to_collection to apply_func

* Separated class description from init args

* Apply suggestions from code review

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* adjust random values

* suppress output when seeding

* remove gpu from doctest

* Add requested changes and add ellipsis for doctest

* forgot to push these files...

* add explicit check for dtype to convert to

* fix ddp tests

* remove explicit ddp destruction

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* add function to reduce tensors (similar to reduction in torch.nn)

* add functionals of reduction metrics

* add functionals of reduction metrics

* add more metrics

* pep8 fixes

* rename

* rename

* add reduction tests

* add first classification tests

* bugfixes

* bugfixes

* add more unit tests

* fix roc score metric

* fix tests

* solve tests

* fix docs

* Update CHANGELOG.md

* remove binaries

* solve changes from rebase

* add eos

* test auc independently

* fix formatting

* docs

* docs

* chlog

* move

* function descriptions

* Add documentation to native metrics (#2144)

* add docs

* add docs

* Apply suggestions from code review

* formatting

* add docs

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka <jirka@pytorchlightning.ai>

* Rename tests/metrics/test_classification.py to tests/metrics/functional/test_classification.py

* Rename tests/metrics/test_reduction.py to tests/metrics/functional/test_reduction.py

* Add module interface for classification metrics

* add basic tests for classification metrics' module interface

* pep8

* add additional converters

* add additional base class

* change baseclass for some metrics

* update classification tests

* update converter tests

* update metric tests

* Apply suggestions from code review

* tests-params

* tests-params

* imports

* pep8

* tests-params

* formatting

* fix test_metrics

* typo

* formatting

* fix dice tests

* fix decorator order

* fix tests

* seed

* dice test

* formatting

* try freeze test

* formatting

* fix tests

* try spawn

* formatting

* fix

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
Co-authored-by: Xavier Sumba <c.uent@hotmail.com>
Co-authored-by: Jirka <jirka@pytorchlightning.ai>
Co-authored-by: Nicki Skafte <nugginea@gmail.com>
2020-06-13 08:47:25 -04:00
Jirka Borovec 2674976f2c
remove deprecated API for v0.8 (#2073)
* remove deprecated API

* chlog

* times

* missed

* formatting check

* missing

* missing

* miss

* fix docs build error

* fix pep whitespace error

* docs

* wip

* amp_level

* amp_level

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-06-12 14:37:52 -04:00
Peter Yu 06cd849538
Allow loading checkpoints from urls (#1667)
* allow loading checkpoints from urls

* tmpdir_server fixture

* test cases for loading checkpoints from url

* dir => root_dir

* default map_location to None

* test case for resume_from_checkpoint

* changelog

* doc update

* monkeypatch TORCH_HOME to avoid caching

* Use a threading server with random ports so that it is easier to clean up

* test fixes

* pep8 fix

* ThreadingHTTPServer support in 3.6

* pep8 fix

* fix changelog

* separate tests for urls

* typo

Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-06-11 17:12:48 -04:00
Justus Schock bd49b07fbb
Rework of Sklearn Metrics (#1327)
* Create utils.py

* Create __init__.py

* redo sklearn metrics

* add some more metrics

* add sklearn metrics

* Create __init__.py

* redo sklearn metrics

* New metric classes (#1326)

* Create metrics package

* Create metric.py

* Create utils.py

* Create __init__.py

* add tests for metric utils

* add docstrings for metrics utils

* add function to recursively apply other function to collection

* add tests for this function

* update test

* Update pytorch_lightning/metrics/metric.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* update metric name

* remove example docs

* fix tests

* add metric tests

* fix to tensor conversion

* fix apply to collection

* Update CHANGELOG.md

* Update pytorch_lightning/metrics/metric.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* remove tests from init

* add missing type annotations

* rename utils to convertors

* Create metrics.rst

* Update index.rst

* Update index.rst

* Update pytorch_lightning/metrics/convertors.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/metrics/convertors.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* add doctest example

* rename file and fix imports

* added parametrized test

* replace lambda with inlined function

* rename apply_to_collection to apply_func

* Separated class description from init args

* Apply suggestions from code review

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* adjust random values

* suppress output when seeding

* remove gpu from doctest

* Add requested changes and add ellipsis for doctest

* forgot to push these files...

* add explicit check for dtype to convert to

* fix ddp tests

* remove explicit ddp destruction

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* add sklearn metrics

* start adding sklearn tests

* fix typo

* return x and y only for curves

* fix typo

* add missing tests for sklearn funcs

* imports

* __all__

* imports

* fix sklearn arguments

* fix imports

* update requirements

* Update CHANGELOG.md

* Update test_sklearn_metrics.py

* formatting

* formatting

* format

* fix all warnings and formatting problems

* Update environment.yml

* Update requirements-extra.txt

* Update environment.yml

* Update requirements-extra.txt

* fix all warnings and formatting problems

* Update CHANGELOG.md

* docs

* inherit

* docs inherit.

* docs

* Apply suggestions from code review

Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>

* docs

* req

* min

* Apply suggestions from code review

Co-authored-by: Tullie Murrell <tulliemurrell@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka <jirka@pytorchlightning.ai>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
Co-authored-by: Tullie Murrell <tulliemurrell@gmail.com>
2020-06-10 15:43:12 +02:00
Jirka Borovec 16a7326e52
test cloudpickle (#2105)
* cloudpickle

* ci tests
2020-06-09 16:51:30 -04:00
William Falcon 479ab49d03
temporarily fixes early stopping bug (#2119)
* fixes early stopping bug

* fixes early stopping bug

* fixes early stopping bug

* fixes early stopping bug

* fixe docs

* fixe docs

* fixe docs

* fixe docs

* fixe docs

* fixe docs

* fixe docs

* fixe docs

* fixe docs

* fixe docs

* fixe docs

* fixe docs

* fixe docs

* fixe docs

* fixe docs

* fixe docs

* fixe docs

* fixe docs

* fixe docs

* fixe docs

* added test
2020-06-08 19:28:26 -04:00
Jirka Borovec d2967d9305
update hparams, allow OmegaConf (#2047)
* DictConf

* inits

* Apply suggestions from code review

Co-authored-by: Omry Yadan <omry@fb.com>

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* atrib

* wip

* wip

* wip

* added hparams test

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* Update test_hparams.py

* added hparams test

* added hparams test

* pep8

* pep8

* pep8

* docs

* wip

* wip

* clean

* review @omry

* Update docs/source/hyperparameters.rst

Co-authored-by: Omry Yadan <omry@fb.com>

Co-authored-by: Omry Yadan <omry@fb.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-06-08 07:19:34 -04:00
Jirka Borovec c09317e68f
cleaning (#2030)
* cleaning

* optim imports

* fix

* typo

* on

* mergify
2020-06-04 11:25:07 -04:00
William Falcon d96df75d6a
testing new speed (#1587)
* fixed new amp bugs

* fixed new amp bugs

* fixed new amp bugs

* try exit

* larger dataset

* full mnist

* full mnist

* trainer

* assert

* .05

* .10, #4

* #5

* #5

* #5

* refactor

* abs diff

* speed

* speed

* speed

* speed

Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
Co-authored-by: Jirka <jirka@pytorchlightning.ai>
2020-06-04 11:20:12 -04:00
Adrian Wälchli 4234992302
Fix local variables being collected into module_arguments dict (#2048)
* do not include local vars in auto collection

* add test

* add test for model with "self" renamed to "obj"

* skip decorator

* changelog

* changelog

* update docs

* remove obsolete child collection

* generalize **args, **kwargs names

* docs

* also update varargs passed in

* Revert "also update varargs passed in"

This reverts commit 3d7a30dbee07a513ee13e1cc3e08ca5ccdb85734.

* update test
2020-06-04 08:35:50 -04:00
kumuji fd7814d287
Added black formater for the code with code-checker on pull (#1610)
* black

Added throught black.toml other options are hard so far

No caching for black github action

Moved from black.toml to pyproject.toml

Exclude not only yml but also yaml

Update pyproject.toml

Co-authored-by: Thomas Johansen <thomasjo@gmail.com>

Update .github/workflows/code-formatting-check.yml

mergify

Remove formating check

E231 error ignoring because of black formating

Updated CONTRIBUTING to the master

* Update .github/workflows/code-formatting-check.yml

* Bump black to 19.10b0 version

* resolved incorrect merge of CONTRIBUTING,

Black skipping string normalization

* Minor fixes in CONTRIBUTING, two typos

* Update setup.cfg

* chlog

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka <jirka@pytorchlightning.ai>
2020-06-03 18:23:14 +02:00
Jirka Borovec c438d0dd90
increase acc (#2039)
* increase acc

* try 0.45

* @pytest

* @pytest

* try .50

* duration

* pytest
2020-06-03 08:28:19 -04:00
Adrian Wälchli 8211256c46
data transfer model hook (+ refactor) (#1756)
* refactor and added hook


variant a


variant b


add test


revert rename


add changelog


docs

* resolve merge duplication

* overridden typo

* fix test

* tpu id

* raise if TPU not available

* re-use apply_to_collection function for parsing collections

* comment

* make utility function available to user

* documentation

* move changelog entry to top

* fix tpu transfer call

* fix call

* remove hardcoded string

* improve test

* call model hook by default

* Apply suggestions from code review

* rename utility function

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-06-02 21:45:19 -04:00
Devashish Shankar ade3f36b7a
Raise an error when lightning replaces an existing sampler (#2020)
* Raise an error when lightning replaces an existing sampler

Currently, Trainer replaces the existing sampler with DistributedSampler
if running distributing training and `replace_sampler_ddp=True` (default
behaviour). If a user has configured an existing sampler, this would
lead to widely different results if running a distributed vs
non-distributed training.

This PR fixes this by raising an Error if user has configured a sampler
and uses `replace_sampler_ddp=True`. The recommended behavior from now
on is to either remove the sampler or set `replace_sampler_ddp=False`

* Fix tests

* Simpler fix

* Fix tests

* Make inner method protected

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-06-02 18:52:04 -04:00
Ivan Nazarov e85a646a41
Mistake in parameters' grad norm tracking (#2012)
* fix grad norm formula

* grad-norm tracker test

* fixed seed and explicit rtol in grad norm tracking test

* a docstring for grad-norms and forced cast to float of norm_type

* support for inf-norm

* renamed the grad norm test

* docs

* fixed language in docstring

* Apply suggestions from code review

Co-authored-by: Jirka <jirka@pytorchlightning.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-06-02 18:51:09 -04:00
Udit Arora 26b69917b4
Add Open MPI installation details for horovod (#2050) 2020-06-02 18:48:26 -04:00
Boris Dayma 00f1ac11e6
fix(wandb): use same logger on multiple training loops (#2055)
* fix(wandb): use same logger on multiple training loops

New training loops reset step to 0 which would previously try to overwrite logs

fix #2015

* docs(changelog.md): add reference to PR 2055
2020-06-02 18:46:02 -04:00
William Falcon 82a20296e3
Replaces ddp .spawn with subprocess (#2029)
* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* replace ddp spawn with subprocess

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix
2020-06-01 11:00:32 -04:00
William Falcon 0e37e8c4d2
hotfix to unblock hparams and OmniConf - removes auto_register_init_args by default (#2025)
* ogc install

* cleaned up tests

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix

* hot fix
2020-05-31 08:29:51 -04:00
Jirka Borovec 9893681859
fix changelog (#1864)
* fix chlog

* test for #1729

* hist

* update

* Document use case of passing test dataloaders to Trainer.test() (#1992)

* Issue 1990 Doc patch.

* Codeblock directive.

* Update to reflect current state of pytorch-lightning

* Final grammar cleaning. I hope these commits are squashed.

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

Co-authored-by: authman <uapatira@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-05-31 00:48:05 -04:00
Jirka Borovec df78e84060
unify tests (#1940)
* unify tests

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-05-27 22:45:23 -04:00
Ivan Nazarov 7c19c373ac
LearningRateLogger in multi-scheduler setting (#1944)
* fixed undesired behaviour due to dict.fromkeys

* a test for log length consistency

* runtime-warn if no schedulers are configured

* chlog

* move

Co-authored-by: Jirka <jirka@pytorchlightning.ai>
2020-05-27 22:44:46 -04:00
Jirka Borovec 5e8c5abf63
fix default arg (#1927)
* fix default

* formatting errors

* update

* flake8
2020-05-26 19:04:42 -04:00
Adrian Wälchli 34237cfcaf
handle unknown args passed to Trainer.from_argparse_args (#1932)
* filter valid args

* error on unknown manual args

* added test

* changelog

* update docs and doctest

* simplify

* doctest

* doctest

* doctest

* better test with mock check for init call

* fstring

* extend test

* skip test on 3.6 not working

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-05-25 16:01:29 -04:00
Adrian Wälchli 8ca8336ce5
protect progress bar callback (#1855)
* wip protected progress bar settings

* remove callback attr from LRfinder

* whitespace

* changelog
2020-05-25 07:49:23 -04:00
Lucas Vazquez 112dd5c4f6
Adds the option of saving the last model on checkpoint (#1908)
* saves model every epoch

* implement test for save_last

* Update CHANGELOG.md

* Update CHANGELOG.md

* changes test description

Co-authored-by: Jeremy Jordan <13970565+jeremyjordan@users.noreply.github.com>

Co-authored-by: Jeremy Jordan <13970565+jeremyjordan@users.noreply.github.com>
2020-05-25 07:47:44 -04:00
Nicki Skafte a34eb9e169
Fix logger bug and prepare data bug (#1933)
* tests, fix logger bug and prepare data bug

* add CHANGELOG.md

Co-authored-by: Nicki Skafte <nugginea@gmail.com>
2020-05-25 07:43:56 -04:00
William Falcon caa9c6760b
replace Hparams by init args (#1896)
* remove the need for hparams

* remove the need for hparams

* remove the need for hparams

* remove the need for hparams

* replace self.hparams

* replace self.hparams

* replace self.hparams

* replace self.hparams

* replace self.hparams

* replace self.hparams

* replace self.hparams

* replace self.hparams

* replace self.hparams

* replace self.hparams

* replace self.hparams

* replace self.hparams

* replace self.hparams

* replace self.hparams

* replace self.hparams

* replace self.hparams

* replace self.hparams

* replace self.hparams

* replace self.hparams

* replace self.hparams

* replace self.hparams

* replace self.hparams

* replace self.hparams

* replace self.hparams

* replace self.hparams

* replace self.hparams

* replace self.hparams

* replace self.hparams

* replace self.hparams

* replace self.hparams

* replace self.hparams

* replace self.hparams

* fixed

* fixed

* fixed

* fixed

* fixed

* fixed

* fixed

* fixed

* fixed

* fixed

* fixed

* fixed

* fixed

* fixed

* finished moco

* basic

* testing

* todo

* recurse

* hparams

* persist

* hparams

* chlog

* tests

* tests

* tests

* tests

* tests

* tests

* review

* saving

* tests

* tests

* tests

* docs

* finished moco

* hparams

* review

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* hparams

* overwrite

* transform

* transform

* transform

* transform

* cleaning

* cleaning

* tests

* examples

* examples

* examples

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* chp key

* tests

* Apply suggestions from code review

* class

* updated docs

* updated docs

* updated docs

* updated docs

* save

* wip

* fix

* flake8

Co-authored-by: Jirka <jirka@pytorchlightning.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-05-24 18:59:08 -04:00
Justus Schock 9b629637b8
New metric classes (#1326) (#1877)
* New metric classes (#1326)

* Create metrics package

* Create metric.py

* Create utils.py

* Create __init__.py

* add tests for metric utils

* add docstrings for metrics utils

* add function to recursively apply other function to collection

* add tests for this function

* update test

* Update pytorch_lightning/metrics/metric.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* update metric name

* remove example docs

* fix tests

* add metric tests

* fix to tensor conversion

* fix apply to collection

* Update CHANGELOG.md

* Update pytorch_lightning/metrics/metric.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* remove tests from init

* add missing type annotations

* rename utils to convertors

* Create metrics.rst

* Update index.rst

* Update index.rst

* Update pytorch_lightning/metrics/convertors.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/metrics/convertors.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/metrics/convertors.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/metrics/metric.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/utilities/test_apply_to_collection.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/utilities/test_apply_to_collection.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/metrics/convertors.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Apply suggestions from code review

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* add doctest example

* rename file and fix imports

* added parametrized test

* replace lambda with inlined function

* rename apply_to_collection to apply_func

* Separated class description from init args

* Apply suggestions from code review

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* adjust random values

* suppress output when seeding

* remove gpu from doctest

* Add requested changes and add ellipsis for doctest

* forgot to push these files...

* add explicit check for dtype to convert to

* fix ddp tests

* remove explicit ddp destruction

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* move dtype device mixin to more general place

* refactor to general device dtype mixin

* add initial metric package description

* change default to none for mac os

* pep8

* fix import

* Update index.rst

* Update ci-testing.yml

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Update CHANGELOG.md

* Update pytorch_lightning/metrics/converters.py

* readme

* Update metric.py

* Update pytorch_lightning/metrics/converters.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Jirka <jirka@pytorchlightning.ai>
2020-05-19 11:05:07 -04:00
Rohit Gupta ac76dfcf62
Remove NaNs from loss in LRFinder (#1862)
* Remove NaNs from loss in LRFinder

* np.isfinite

* chlog

* add test

* chlog

Co-authored-by: Jirka <jirka@pytorchlightning.ai>
2020-05-19 08:39:19 +02:00
Jirka Borovec a153fe4c2a
fix codecov reports (#1867)
* fix codecov

* upgrade codecov

* upgrade codecov
2020-05-18 20:34:59 -04:00
Lezwon Castelino 7c7e50ca47
Allow user to select individual TPU core to train on (#1729)
* added tpu_id

added tpu_id to mixins

* train on individual tpu

* parallel loader if tpu_id is None

* removed progress_bar_refresh_rate

* chlog

* replaced num_tpu_cores with tpu_cores

* set tpu_id to None if int

* changed num_tpu_cores to tpu_cores in docs

* updated docs

* updated __init__.py
removed self.tpu_id for ParallelLoader

* Update pytorch_lightning/trainer/__init__.py

* check if tpu_cores is a list

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* xla device conditional

* num_tpu_cores deprecation

* removed duplicate warning

* fixed pep8 error

* Revert "removed duplicate warning"

This reverts commit 8adb0a9b

* deprecated api update

* fixed recursion error

* fixed tests

* fixed flake errors

* removed current_tpu_index

* Update CHANGELOG.md

* Update trainer.py

Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-05-17 16:30:54 -04:00
Victor Quach 1a797bdad5
add test for trainer.test() (#1858)
* fix trainer.test()

* Update trainer.py

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-05-17 16:30:20 -04:00
Fabio Natanael Kepler 8c4c7b105e
Fix `save_weights_only` flag in ModelCheckpoint (#1780)
* Add flag to `dump_checkpoint` for only including weights

`ModelCheckpoint` then passes `self.save_weights_only` to the save function.

* Fix tests and add changelog entry

* Add check and descriptive message when training state is restored from a weights only checkpoint

Also add a test for making sure `ModelCheckpoint.save_weights_only` works as expected.

* Fix weights-only test to properly match expected exception

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-05-17 09:24:17 -04:00
Adrian Wälchli 769a459d27
remove extra kwargs from Trainer init (#1820)
* remove kwargs

* remove useless test

* rename unknown trainer flag

* trainer inheritance and test

* blank line

* test for unknown arg

* changelog
2020-05-17 09:14:54 -04:00
Jirka Borovec bee0392c37
extend arg parser (#1842)
* extend arg parser

* flake8

* tests

* example

* fix test
2020-05-14 17:56:11 -04:00
William Falcon 53d9316a56
fixes ddp bugs (#1819)
* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug
2020-05-13 19:17:04 -04:00
William Falcon 648d516668
Use store_true for bool args (#1822)
*  Use store_true for bool args

* debug

Co-authored-by: Nate Raw <nxr9266@g.rit.edu>
2020-05-13 19:12:06 -04:00
Nicki Skafte 663b90035c
Bugfix: accumulation and suggestion for learning rate finder (#1801)
* fix suggestion being too naive

* fix accumulation error and added new tests

* fix styling

* update CHANGELOG.md

* update based on review

* fix tests

* Apply suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Nicki Skafte <nugginea@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-05-13 14:40:44 -04:00
So Uchida 22d7d03118
Replace meta_tags.csv with hparams.yaml (#1271)
* Add support for hierarchical dict

* Support nested Namespace

* Add docstring

* Migrate hparam flattening to each logger

* Modify URLs in CHANGELOG

* typo

* Simplify the conditional branch about Namespace

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update CHANGELOG.md

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* added examples section to docstring

* renamed _dict -> input_dict

* mata_tags.csv -> hparams.yaml

* code style fixes

* add pyyaml

* remove unused import

* create the member NAME_HPARAMS_FILE

* improve tests

* Update tensorboard.py

* pass the local test w/o relavents of Horovod

* formatting

* update dependencies

* fix dependencies

* Apply suggestions from code review

* add savings

* warn

* docstrings

* tests

* Apply suggestions from code review

* saving

* Apply suggestions from code review

* use default

* remove logging

* typo fixes

* update docs

* update CHANGELOG

* clean imports

* add blank lines

* Update pytorch_lightning/core/lightning.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Update pytorch_lightning/core/lightning.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* back to namespace

* add docs

* test fix

* update dependencies

* add space

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-05-13 15:05:15 +02:00
Jirka Borovec 10ce1c0256
device property (#1791)
* device property

* add/copy properties

* inherit

* rename

* Apply suggestions from code review

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* dtype

* prop

* pt api

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
2020-05-12 23:18:39 -04:00
kumuji 619f984c36
Option to provide seed to random generators to ensure reproducibility (#1572)
* Option to provide seed to random generators to ensure reproducibility

I added small function in utilities which imports torch, numpy, python
random and sets seed for all of the libraries to ensure reproducibility
of results.

* Apply recommendations from core contributors on seeding

1. Moved the seeding code to another file
2. Make deterministic as a parameter for trainer class
3. Add assertions for seeding numpy
4. Added warnings
5. torch.manual_seed should be enough for seeding torch

* Revert "Apply recommendations from core contributors on seeding"

This reverts commit a213c8e6882eec8a9e7408b9418926d2db7c5461.

* Revert "Revert "Apply recommendations from core contributors on seeding""

This reverts commit 59b2da53c62878de7aab0aa3feb3115e105eea06.

* Change in test, for correct seeding

* Allow seed equal to 0

* Allow seed to be uint32.max

* Added deterministic to benchmarks

* Cuda manual seed as in benchmark seeding

* Seeding should be done before model initialization

* cuda manual_seed is not necessary

* Fixing seed test_cpu_lbfgs

On some seeds seems like lbfgs doesn't converge.
So I fixed the seed during testing.

* rebasing issue with old reproducibility.py

* Improved documentation and ability to seed before initializing Train
class

* Change in docs

* Removed seed from trainer, update for documentation

* Typo in the docs

* Added seed_everything to _all_

* Fixing old changes

* Model initialization should be earlier then Trainer

* Update pytorch_lightning/trainer/__init__.py

From Example to testcode

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Fixing according to the contributors suggestions

* Moving horovod deterministic to Trainer class

* deterministic flag affects horovod docs update

* Improved static typing

* Added deterministic to test runners of horovod

It is failing on some versions, not very predictable

* static seeds for horovod tests

* Change for reset_seed function in tests

* Seeding horovod using reset_seed from tutils

* Update pytorch_lightning/trainer/__init__.py

* chlog

* Update trainer.py

* change "testcode" to "Example" in trainer init documentation

* Update pytorch_lightning/trainer/seed.py, first line in comment

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-05-12 07:53:20 -04:00
William Falcon 10b16dbfab
made ddp the default if no backend specified with multiple GPUs (#1789)
* made ddp the default if no backend specified with multiple GPUs

* fix

* spawn

Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2020-05-12 06:54:23 -04:00
Kevin Chen de1fdd8d3b
Removed test_dataloader call in check_testing_model_configuration (#1670)
* Removed test_dataloader call

* Check if test_dataloader is actually overriden

* Fixed method spelling

* Replaced lambdas

* Replaced None with super method

* Fixed testpass
2020-05-12 00:08:07 -04:00
William Falcon 5bb6b41b78
dataloaders with fast_dev_run (#1787)
* dataloaders with fast_dev_run

* dataloaders with fast_dev_run

* dataloaders with fast_dev_run

* fix

* pep 8
2020-05-11 23:32:44 -04:00
Jirka Borovec 9d2df24d6b
RC & Docs/changelog (#1776)
* missing

* RC

* tol

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* test

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-05-11 21:57:53 -04:00
Rohit Gupta d962ab5d89
Fix lr key name in case of param groups (#1719)
* Fix lr key name in case of param groups

* Add tests

* Update test and added configure_optimizers__param_groups

* Update CHANGELOG
2020-05-10 17:05:34 -04:00
Piotr Łusakowski 0cb6767465
Fix NeptuneLogger to work in ddp mode (#1753) 2020-05-10 13:19:18 -04:00
Jirka Borovec 134eb61e1a
Tests: refactor cleanup (#1744)
* wip

* cleaning

* optim imports

* -

* default hparams

* fix restore

* fix imports
2020-05-10 13:15:28 -04:00
Nicki Skafte 4970927ec8
Feature: auto scale batch size (#1638)
* auto batch finder

* fix styling

* add description

* add different modes

* fix copy paste error

* better organised code

* fix styling

* add tests

* fix

* fix

* add some documentation

* added CHANGELOG.md

* some documentation

* update based on review

* Update trainer.py

* Update docs/source/training_tricks.rst

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Update tests/trainer/test_trainer_tricks.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/trainer/test_trainer_tricks.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* use EvalModelTemplate

* param tests

* rename

* wrap params

* rename function

* rename

* rename param

* fix

* abs

* rename

* refactor code

* add docs

* try

* arg

* loop

* exept

* loop

* drop bool

* docs

* docs

* added check and test for passing dataloader to fit

* styling fix

* update based on review

Co-authored-by: Nicki Skafte <nugginea@gmail.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2020-05-09 08:28:36 -04:00
Adrian Wälchli 25bbd059df
Also update progress_bar in training_epoch_end (#1724)
* update prog. bar metrics on train epoch end

* changelog

* wip test

* more thorough testing

* comments

* update docs

* move test

Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2020-05-08 23:31:56 -04:00
Shunta Komatsu f656882942
Fix typo (#1750) 2020-05-07 09:25:54 -04:00