791 Commits

Author SHA1 Message Date
William Falcon 5abf7d9123
ref: move lr_finder (#3434)
* ref: move lr_finder

* ref: move lr_finder

* ref: move lr_finder

* ref: move lr_finder

* ref: move lr_finder

* ref: move lr_finder

* ref: move lr_finder
2020-09-09 22:12:27 -04:00
William Falcon b36c5e86d0
ref: trainer argparse 1/n (#3421)
* ref: trainer argparse 1/n

* ref: trainer argparse 1/n

* ref: trainer argparse 1/n

* ref: trainer argparse 1/n

* ref: trainer argparse 1/n

* ref: trainer argparse 1/n

* ref: trainer argparse 1/n

* ref: trainer argparse 1/n
2020-09-09 12:31:17 -04:00
Patrick Orlando 656c1af0df
Get experiment_id from MLFlow only once instead of each training loop (#3394)
* Get experiment_id from MLFlow only once instead of each training loop.

* Apply suggestions from code review

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* add test that asserts mlflow client is called to retrieve experiment id only once

* make pep8 happy

* logs

Co-authored-by: Patrick Orlando <patrick.orlando@rea-group.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
2020-09-09 11:38:26 +02:00
Adrian Wälchli e245065fbc
limit auto scaling batch size to the size of the training dataset (#3271)
* fix

* fix and test

* fix merge error

* test for max dataset size

* changelog

* update docs

* fix merge

* unused imports

* imports
2020-09-09 10:51:43 +02:00
William Falcon 8f6b115511
ref: added model connector (#3407)
* ref: added model connector

* ref: added model connector

* ref: added model connector
2020-09-09 00:24:20 -04:00
William Falcon 722c44c7d0
ref: device to gpus (#3405)
* ref: device to gpus

* ref: device to gpus

* ref: device to gpus

* ref: device to gpus

* ref: device to gpus
2020-09-08 22:14:17 -04:00
Travis Addair 091d37f968
Added check for apex AMP and unit tests for Horovod + AMP (#3404)
* Added check for apex AMP and unit tests for Horovod + AMP

* Changelog

* Fixed order of Horovod and Apex optimizer wrapping
2020-09-08 20:30:57 -04:00
William Falcon aaf26d70c4
ref: device parser (#3400)
* ref: train loop refactors part 2: 1/n

* ref: device parser

* ref: device parser

* ref: device parser

* ref: device parser

* ref: device parser

* ref: device parser

* ref: device parser

* ref: device parser
2020-09-08 18:46:42 -04:00
William Falcon ff5f099cb7
ref: remove inner train loop 1/n (#3397)
* ref: remove inner train loop 1/n

* ref: remove inner train loop 1/n
2020-09-08 12:05:00 -04:00
William Falcon d438ad8a8d
ensure calling test multiple times does not change results (#3391)
2020-09-07 22:25:12 -04:00
William Falcon b76d9e5dd5
Refa22 (#3388)
* ref: inner train loop (intermediate step) 20/n

* ref: inner train loop (intermediate step) 21/n

* ref: inner train loop (intermediate step) 21/n

* ref: inner train loop (intermediate step) 21/n

* ref: inner train loop (intermediate step) 21/n

* ref: inner train loop (intermediate step) 21/n
2020-09-07 16:45:31 -04:00
William Falcon 0b5b70d6c9
ref: inner train loop (intermediate step) 17/n (#3376)
* ref: inner train loop (intermediate step) 17/n

* ref: inner train loop (intermediate step) 17/n

* ref: inner train loop (intermediate step) 17/n
2020-09-07 09:31:42 -04:00
William Falcon 69e3f904df
ref: inner train loop (intermediate step) 16/n (#3375)
* ref: inner train loop (intermediate step) 16/n

* ref: inner train loop (intermediate step) 16/n

* ref: inner train loop (intermediate step) 16/n

* ref: inner train loop (intermediate step) 16/n

* ref: inner train loop (intermediate step) 16/n

* ref: inner train loop (intermediate step) 16/n
2020-09-06 21:57:20 -04:00
William Falcon 7073de8a95
ref: inner train loop (intermediate step) 14/n (#3373)
* ref: inner train loop (intermediate step) 14/n

* ref: inner train loop (intermediate step) 14/n
2020-09-06 19:55:18 -04:00
William Falcon 85421466ab
ref: inner train loop (intermediate step) 10/n (#3369)
2020-09-06 08:59:58 -04:00
Rohit Gupta 24809b0b26
Refactor GPUStatsMonitor to improve training speed (#3257)
* Refactor GPUMonitor to improve training speed

* added gpu ids to monitor

* update tests

* added deprecation warning

* pep

* fix test

* fix docs

* fix log_gpu_memory

* move deprecation check

* chlog

* Update CHANGELOG.md

* suggestions and fix

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-09-04 06:02:16 -04:00
Adrian Wälchli 48c22c8bad
update batch size in DataModule when auto scaling batch size (#3266)
* fix datamodule hasattr

* fix patch check

* fix setattr

* update docs

* revert patch fix

* changelog

* fix datamodule passed in as fit arg

* docs

* set datamodule batch size in lightning_setattr

* fix merge

* check with has_attr

* access datamodule via trainer

* pass fit args down to tuner

* docs

* fix typos in docs

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-09-03 22:07:49 +02:00
Adrian Wälchli 4ad5a78dce
to_torchscript method for LightningModule (#3258)
* script

* docs

* simple test

* move test

* fix doctest

* no grad context

* extend tests


test


test

* datamodule test

* clean up test

* docs

* name

* fix import

* update changelog

* fix import

* skip pytorch 1.3 in test

* update codeblock

* skip bugged 1.4

* typehints

* doctest not working on all pytorch versions

* rename TestGAN to prevent pytest interference

* add note about pytorch version

* fix torchscript version inconsistency in tests

* reset training state + tests

* update docstring

* Apply suggestions from code review

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* update docstring, dict return

* add docs to index

* add link

* doc eval mode

* forward

* optional save to file path

* optional

* test torchscript device

* test save load with file path

* pep

* str

* Commit typing suggestion

Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>

* skip test if cuda not available

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
2020-09-03 20:24:44 +02:00
Rohit Gupta 4a22fca524
Changed LearningRateLogger to LearningRateMonitor (#3251)
* Change LearningRateLogger to LearningRateMonitor

* file rename

* docs

* add LearningRateLogger with deprecation warning

* deprecated LearningRateLogger

* move deprecation check

* chlog

Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
2020-09-03 18:17:15 +00:00
HT Liu d521c1b178
Fix: gather_all_tensors cross GPUs in DDP (#3319)
* Fix: gather_all_tensors cross GPUs in metrics

* add a test case for gather_all_tensors_ddp in #3253
2020-09-03 12:27:32 +02:00
William Falcon 0d90d53a81
ref: moving train loop to own object 2/n (intermediate steps) (#3313)
* ref: moving train loop to own object 2/n (intermediate steps)

* ref: moving train loop to own object 2/n (intermediate steps)
2020-09-01 21:06:40 -04:00
Nicki Skafte b66ce88f0d
[metrics] Renaming of precision recall metric (#3308)
* rename metrics

* update docs
2020-09-01 14:59:33 -04:00
William Falcon 7d57f8d407
ref: move prepare_data to data connector (#3307)
* ref: moved argparse code to central class

* ref: moved argparse code to central class

* ref: moved argparse code to central class
2020-09-01 14:59:09 -04:00
Lezwon Castelino 3910ad0330
bugfix/3185 transpose (#3252)
* change t() to transpose() as xla devices do not support .t() on 1-dim tensor

* detach tensor before copying

* Revert "detach tensor before copying"

This reverts commit 37cc7bbe

* changed dims

* added test_result_obj_on_tpu

* detach before copying

* detach before copying

* detach before copying

* replace torch.cat with sum
2020-09-01 09:17:52 -04:00
William Falcon 805ff37e8c
ref: .tune() (temporary) (#3293)
* ref: .tune()

* ref: .tune()

* ref: .tune()

* ref: .tune()

* ref: .tune()

* ref: .tune()
2020-08-31 17:36:09 -04:00
William Falcon caf7893f27
ref: modular is_overridden (#3290)
* ref: modular is_overridden

* ref: modular is_overridden

* ref: modular is_overridden

* ref: modular is_overridden
2020-08-31 12:12:02 -04:00
Carlos Mocholí cc80749c7e
Parse Union[bool, str] arguments (#3235)
* Parse Union[bool, str] arguments

* Address review

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-08-29 10:39:42 -04:00
Jeremy Jordan a5d1176cf6
callback method for on_save_checkpoint (#2501)
* initial draft

* fix test

* Update pytorch_lightning/trainer/callback_hook.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* fix tests

* remove old code

* untested upgrade script

* document limitations

* clean up and add tests

* Update pytorch_lightning/trainer/training_io.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* reflect PR comments

* fix formatting

* Update docs/source/callbacks.rst

* clarify docs

* revert change for loading checkpoints

* small edits

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-08-28 16:50:52 +02:00
monney d5254ff9df
warn user when dropping unpicklable hparams (#2874)
* refactored clean_namespace

* Update try except to handle pickling error

* Consolidated clean_namespace. Added is_picklable

* PEP8

* Change warning to use rank_zero_warn. Added Test to ensure proper hparam filtering

* Updated imports

* Corrected Test Case
2020-08-28 09:07:43 +02:00
Rohit Gupta 85cd558a3f
Follow up of #2892 (#3202)
* Follow up of #2892

* typo

* iterabledataset
2020-08-27 15:28:29 -04:00
Rohit Gupta f03943ee94
Fix GpuUsageLogger to work on different platforms (#3008)
* Fix GpuUsageLogger

* docstrings

* misconfigexception

* add basic tests

* skip doctest

* fix parameter and docstring

* rm cl

* skip doctest

* cleanup

* chlog

* add suggestions from review

* add test from suggestions

* fix import

* fix test

* fix test

* fix test

* fix test

* rename GpuUsageLogger to GPUStatsMonitor

* doc fix

* Apply suggestions from code review

* update docs format

* update docs

* miss

* merge

* fix title formatting

* unindent

* punctuation

* simplify if statements

* fix test

* suggestions

* pep

* Update CHANGELOG.md

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* fix on_train_batch_*

* use AttributeDict

* usage

* rank zero

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* import

* minor changes

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Adrian Wälchli <adrian.waelchli@inf.unibe.ch>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-08-27 19:50:32 +02:00
William Falcon f3c63f7746
tests to ensure correct dataloader calls (#3221)
* tests to ensure correct dataloading interval and sequence

* tests to ensure correct dataloading interval and sequence

* tests to ensure correct dataloading interval and sequence

* tests to ensure correct dataloading interval and sequence

* tests to ensure correct dataloading interval and sequence
2020-08-27 09:49:46 -04:00
William Falcon a1705441a9
ref: remove _evaluate fx (#3197)
* remove _evaluate

* remove _evaluate

* remove _evaluate

* remove _evaluate

* remove _evaluate

* remove _evaluate

* remove _evaluate

* remove _evaluate
2020-08-26 12:28:14 -04:00
Lezwon Castelino d9ea25590e
fix ONNX model save on GPU (#3145)
* added to(device)

* added test

* fix test on gpu

* Update pytorch_lightning/core/lightning.py

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* Update pytorch_lightning/core/lightning.py

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* remove multi gpu check

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* updated message

* Update pytorch_lightning/core/lightning.py

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* updated test

* onxx to onnx

* Update pytorch_lightning/core/lightning.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* Update tests/models/test_onnx.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* add no grad

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* add isinstance back

* chlog

* error if input_sample is not Tensor

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
2020-08-26 16:22:19 +00:00
Sordie 888340d17e
Fix RMSLE metric (#3188)
* fix rmsle

* Updated test to match rmsle fix

* Updated RMSLE example result to match functional

* chlog

* add randomized test

* fix pep8

Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
2020-08-26 08:02:53 -04:00
Nicki Skafte 17d8773106
New modular metric interface (#2528)
* new base structure

* missing packages

* updated interface

* revert some changes

* fixes

* add changelog

* fix bug

* added description

* test for picklable

* fixing test

* fixing test

* fix pickle issue

* reduceop typehints back

* remove redundant module arg

* add save/load test

* add aggregate method

* text clarification

* fix doctest

* Apply suggestions from code review

* change test to results obj

* fix docs

* formatting

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* formatting

* pep

* Update CHANGELOG.md

* suggestions

* fix tests

* fix pep8

* fix tests

Co-authored-by: Nicki Skafte <nugginea@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-08-26 13:01:29 +02:00
William Falcon bda1400225
ref: restore on_eval_start hook (#3183)
* restore eval loop hook
2020-08-26 00:45:43 -04:00
William Falcon 2f6d82e0e6
ref: remove on_eval_start hook (#3176)
* remove on_eval_start hook

* remove on_eval_start hook
2020-08-25 22:28:00 -04:00
William Falcon 6068b29d29
ref: remove obscure forward call in eval + CPU backend ___step (#3123)
* remove obscure forward call in eval

* remove obscure forward call in eval

* remove obscure forward call in eval

* remove obscure forward call in eval

* remove obscure forward call in eval

* remove obscure forward call in eval
2020-08-24 12:31:40 -04:00
Uladzislau Sazanovich 2d42ec008f
Make trainer.state a read-only property (#3109)
* Make trainer.state a read-only property

* Update states.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-08-24 16:49:33 +02:00
William Falcon 8d7ca5cd2c
ref: refactored gpu backend __step (#3120)
* refactored gpu backend __step

* refactored gpu backend __step

* refactored gpu backend __step

* refactored gpu backend __step
2020-08-24 09:22:05 -04:00
Jirka Borovec 45e7491dcc
drop packaging (#3105)
2020-08-24 05:28:56 -04:00
s-rog 7b054399c6
fix tb hparams logging (#2974)
* log_hyperparams add default metric

also adds scalar support

* fix typos and style

* another typo

* keep original logging implementation

* remove missed line

* fix capitalization

* add step to log_metrics for tests

* disable hp metric none (-1) logging

to pass tests

* initial arg implementation

* add step to log_metrics

* add hp_metric case to log test

* add docs 

and minor formatting

* fix broken else

* pep8 style

* edit tests

* Update pytorch_lightning/loggers/tensorboard.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* Update pytorch_lightning/loggers/tensorboard.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-08-24 06:57:04 +00:00
Rohit Gupta 34c88d127b
Fix log_graph in TensorBoardLogger (#3092)
2020-08-22 06:35:09 -04:00
Rohit Gupta 7cca3859a7
Fix num_sanity_val_steps is clipped to limit_val_batches (#2917)
* Fix num_sanity_val_steps according to limit_val_steps

* fix test

* add num_sanity_batches

* pep

* update docstring in test

* add more test

* chlog

* update comments and docstring in test

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Adrian Wälchli <adrian.waelchli@inf.unibe.ch>
Co-authored-by: Ananya Harsh Jha <ananya@pytorchlightning.ai>
2020-08-21 20:11:31 +02:00
Jirka Borovec bcdb750976
changelogs clean (#3082)
* clean

* ver
2020-08-20 22:58:53 +00:00
Nathan Raw bab89b8d21
Add transfer_batch_to_device hook to DataModule (#3038)
* add dm to_device logic in trainer

* 🔥 remove unnecessary comment

* add to_device logic to datamodule

* add test

* updated docs

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-08-20 08:47:11 -04:00
Peter Yu cee5eaf659
flake8 fixes (#3064)
* flake8 fixes

* fix pep8

* fix pep8

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-08-20 07:45:22 -04:00
Peter Yu 88886ace72
More robust way of collecting init argument names for LightningModules (#3066)
When a LightningModule inherits from a class that implements `__new__()` such as `typing.Generic`, `inspect.signature(cls)` short-circuits and returns the signature of `__new__()` instead of `__init__()`. So, we need to be more specific and call inspection directly on the init function.
2020-08-20 07:19:11 -04:00
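The approach described in the message for #3066 above can be sketched as follows. This is a minimal illustration, not the code from that commit: `BaseWithNew`, `ExampleModule`, and `collect_init_arg_names` are hypothetical names, and the base class is a stand-in for any parent that implements `__new__()`. The result of `inspect.signature(cls)` itself is not shown because it may differ between Python versions; inspecting `cls.__init__` directly avoids that ambiguity.

```python
import inspect
from typing import List


class BaseWithNew:
    """Hypothetical stand-in for a base class that implements __new__()."""

    def __new__(cls, *args, **kwargs):
        # Accept any arguments and defer real initialisation to __init__.
        return super().__new__(cls)


class ExampleModule(BaseWithNew):
    """Hypothetical module; not a real LightningModule."""

    def __init__(self, hidden_dim, lr=1e-3):
        self.hidden_dim = hidden_dim
        self.lr = lr


def collect_init_arg_names(cls) -> List[str]:
    """Return the parameter names of cls.__init__, excluding self."""
    params = inspect.signature(cls.__init__).parameters
    return [name for name in params if name != "self"]


print(collect_init_arg_names(ExampleModule))  # ['hidden_dim', 'lr']
```

Calling inspection on the init function keeps the result tied to `__init__()` regardless of what the parent classes define for `__new__()`.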
William Falcon 3453bba898
re-enabled naming metrics in ckpt name (#3060)
* re-enabled naming metrics in ckpt name

* re-enabled naming metrics in ckpt name

* re-enabled naming metrics in ckpt name

* re-enabled naming metrics in ckpt name

* re-enabled naming metrics in ckpt name

* re-enabled naming metrics in ckpt name
2020-08-19 20:34:09 -04:00
Nicki Skafte cefc7f7c32
Feature/log computational graph (#3003)
* add methods

* log in trainer

* add tests

* changelog

* fix tests

* fix tests

* fix tests

* fix tests

* fix tests

* fix tests

* fix tests

* text

* added argument

* update tests

* fix styling

* improve testing
2020-08-19 19:08:46 -04:00
Adrian Wälchli 7b917de946
fix setting batch_size attribute in batch_size finder (finishing PR #2523) (#3043)
* lightning attr fix

* revert refactor

* create test

* separate test

* changelog update

* tests

* revert

* Update pytorch_lightning/trainer/training_tricks.py

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-08-19 19:01:55 -04:00
Justus Schock 7358d456f3
Retrieve last logged val from result by key (#3049)
* return last logged value

* Update test_results.py

* Update step_result.py

* Update step_result.py

* pep8

* pep8
2020-08-19 18:59:14 -04:00
Adrian Wälchli 89a5d8fee9
fix auto scale batch size not working with precision=16 (#3045)
* add test

* test

* test

* add fix

* changelog

* check batch size changed
2020-08-19 20:41:33 +00:00
Adrian Wälchli 9031dc3b81
Fix result gathering with varying tensor shapes (#3020)
* test for gathering results

* fix gather

* document tests

* changelog

* assert dtype

* default to concat

* additional test
2020-08-18 20:27:48 -04:00
William Falcon 8315a65d0a
fix result obj dp auto reduce (#3013)
* fix result for dp

* fix result for dp

* fix result for dp

* fix result for dp

* fix result for dp

* fix result for dp

* fix result for dp

* fix result for dp

* fix result for dp

* fix result for dp

* fix result for dp

* fix result for dp

* added warning when changing monitor and using results obj
2020-08-17 10:29:39 -04:00
William Falcon 51de6802ed
added warning when changing monitor and using results obj (#3014)
* added warning when changing monitor and using results obj

* added warning when changing monitor and using results obj

* added warning when changing monitor and using results obj
2020-08-17 10:29:28 -04:00
William Falcon 465d4ffd2c
added lr scheduler test using dev debugger (#3004)
* added lr scheduler test using dev debugger

* added lr scheduler test using dev debugger

* added lr scheduler test using dev debugger
2020-08-16 11:37:38 -04:00
Adrian Wälchli 188e06c261
ddp fix for trainer.test() + add basic ddp tests (#2997)
* add ddp script variations

* add ddp test

* rename

* shell

* test

* test

* try call

* try without subprocess

* test

* display the error

* list all variations

* try string

* try copy env

* debug

* pythonpath

* path

* update test

* change

* simple ddp test

* replace

* remove random port

* random port

* str

* clean up

* check run spawn

* clean up

* docs

* docs

* update test

* docs

* changelog

* changelog
2020-08-16 11:19:57 -04:00
William Falcon d702d4d393
removed callback metrics from test results obj (#2994)
* removed callback metrics from test results obj

* removed callback metrics from test results obj
2020-08-15 21:45:41 -04:00
William Falcon 766d0f391b
re-trigger build (#2988)
* fixed build

* fixed build
2020-08-15 21:13:00 -04:00
Jeff Yang 73ebd1066d
Fix accumulate_grad_batches for last batch (#2853)
* first attempt

* update changelog

* fix pep8 and tests

* Apply suggestions from code review

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* added new tests

* fixed tests

* Apply suggestions from code review

* used num_training_batches

* fixed pep8

* fixed with is_last_batch suggested by @awaelchli

* fixed with num_training_batches

* fixed with num_training_batches

* cleanup

* fix test and update docs

* fixed for alignment, update docs

* minor changes

* update doc

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-08-15 15:06:37 -04:00
William Falcon b8371fa56c
Fixes #2972 #2946 (#2986)
* add val step arg to metrics

* add val step arg to metrics

* add val step arg to metrics

* add val step arg to metrics

* add val step arg to metrics

* add val step arg to metrics

* add val step arg to metrics

* add val step arg to metrics

* add val step arg to metrics

* add step metrics

* add step metrics
2020-08-15 08:36:00 -04:00
Nathan Raw b9695237f1
Save test predictions on multiple GPUs (#2926)
* Save test predictions on multiple GPUs
2020-08-14 17:52:43 -04:00
Lezwon Castelino cfd06a083b
Bugfix/2956 tpu distrib backend fix (#2959)
* override dist backend when using tpus

* added test

* updated doc string

* drop redundant info...

* more redundant info

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
2020-08-13 18:57:23 -04:00
shijianjian 18d31a3b63
Added strict=False for load_from_checkpoint (#2819)
* Added strict=False and hparams_file accepts dict

* Apply suggestions from code review

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* Type check fix

* Added tests

* Linting & test fix

* Removed redundant code & test

* Added strict=False and hparams_file accepts dict

* Apply suggestions from code review

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* Type check fix

* Added tests

* Linting & test fix

* Removed redundant code & test

* Apply suggestions from code review

* tests

* tests

* chlog

* Update tests/models/test_restore.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* update test comments

* Added docstring for the strict attribute

* Added supplementary tests

* Update saving.py

* Apply suggestions from code review

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* pep8, removed extra func

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: ananyahjha93 <ananya@pytorchlightning.ai>
2020-08-13 16:25:43 -04:00
Jirka Borovec 4354690e55
add apex test (#2921)
* add apex test

* rename

* level

* events

* wrap

* evt

* miss

* apex

* apex

* apex

* apex

* apex

* apex

* Update tests/models/test_amp.py

Co-authored-by: William Falcon <waf2107@columbia.edu>

* notes

* notes

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-08-13 10:03:13 -04:00
Jirka Borovec 665c1507f0
deterministic=True (#2944)
2020-08-13 06:29:27 -04:00
Adrian Wälchli 411914bd2b
Fix hparams loading for model that accepts *args (#2911)
* fix hparams loading for model that accepts *args

* add test case

* changelog

* pep

* fix test

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-08-12 09:58:35 -04:00
William Falcon d13e5c9e53
document LightningModule better (#2920)
* updated docs
2020-08-11 19:39:43 -04:00
Adrian Wälchli 69d241c82e
Do not pass non_blocking=True if it does not support this argument (#2910)
* add docs

* non blocking only on tensor

* changelog

* add test case

* add test comment

* update changelog


changelog


chlog
2020-08-11 19:28:37 -04:00
William Falcon 28f79d9f7a
Mapkeys (#2900)
* added a map dict

* added a map dict
2020-08-09 18:50:39 -04:00
Adrian Wälchli 1ac507a255
constant root seed in reset_seed (tests) (#2895)
* fix root_seed in reset_seed

* seed value
2020-08-09 21:23:01 +00:00
Caldera 6c18fd9a24
Update lr_logger.py (#2847)
* Update lr_logger.py

when logging learning_rate, we should provide different choices to log including 'step' and 'epoch'

* Update lr_logger.py

add some type annotations and docstrings

* Update lr_logger.py

fixed a bug where `on_train_batch_start()` can't be triggered; we should use `on_batch_start()` instead. Added an `interval` arg so that we can record learning rates with respect to `global_step` or `current_epoch`.

* Update lr_logger.py

restore _extract_lr()

* suggestion

* Update lr_logger.py

modify _extract_lr(); it no longer needs the `interval` parameter.

* Update test_lr_logger.py

SkafteNicki's suggestion

* log_interval now supports `None`, `step`, `epoch`

* change `log_interval` to `logging_interval`

* Update test_lr_logger.py

* Update lr_logger.py

* put types check into `on_train_start()`

* cleanup

* docstring typos

* minor changes from suggestions

Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2020-08-09 16:30:43 +00:00
Uladzislau Sazanovich e9846dd758
Add tracking of basic states in Trainer [wip - to-be-merged after v0.9] (#2541)
* Add initial tracking of states in Trainer.

* Add INTERRUPTED state, improve tests, move state switching from callback to a trainer.

* Move part of a trainer state switching to a decorator.

* Add documentation.

* Fix docs, rename state enum, restore state to previous on exit if None, add tests for decorator only.

* Fix callback typing.

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-08-09 06:24:09 -04:00
Rohit Gupta 983c030326
fix reduction docstring and clean tests (#2885)
* fix reduction docstring

* Update docstring and some cleanup

* miss

* suggestion from code review

Co-authored-by: Ananya Harsh Jha <ananya@pytorchlightning.ai>

Co-authored-by: Ananya Harsh Jha <ananya@pytorchlightning.ai>
2020-08-09 06:03:24 -04:00
William Falcon 256059a1d0
tracks all outputs including TBPTT and multiple optimizers (#2890)
* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update

* pl 0.9 update
2020-08-09 06:00:15 -04:00
Rohit Gupta 4d0406ec8b
deepcopy model state_dict in tests (#2887)
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2020-08-08 16:13:06 +00:00
Adrian Wälchli f798cffd02
save last model after saving top_k when save_last=True (#2881)
* save_last should be last

* changelog

* seed, docs

* retrigger ci

* compare filenames

* move constants

* fix test

* epoch, global step

* improve test
2020-08-08 06:02:43 -04:00
Jirka Borovec f8c058215f
simplify tests & cleaning (#2588)
* simplify

* tmpdir

* revert

* clean

* accel

* types

* test

* edit test acc

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Update test acc

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-08-07 23:22:05 +02:00
William Falcon f82d7feb6c
updated hooks (#2850)
* modified hooks

* modified hooks

* modified hooks

* modified hooks

* modified hooks

* modified hooks

* modified hooks

* modified hooks

* modified hooks
2020-08-07 09:29:57 -04:00
ananthsub b39f4798a6
Add support to Tensorboard logger for OmegaConf hparams (#2846)
* Add support to Tensorboard logger for OmegaConf hparams

Address https://github.com/PyTorchLightning/pytorch-lightning/issues/2844

We check if we can import omegaconf, and if the hparams are omegaconf instances. If so, we use OmegaConf.merge to preserve the typing, such that saving hparams to yaml actually triggers the OmegaConf branch

* available

* chlog

* test

Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
2020-08-07 09:13:21 -04:00
Rohit Gupta a642349228
Support limit_mode_batches (int) for infinite dataloader (#2840)
* Support limit_mode_batches(int) for infinite dataloader

* flake8

* revert and update

* add and update tests

* pep8

* chlog

* Update CHANGELOG.md

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Add suggestions by @awaelchli

* docs

* Apply suggestions from code review

Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>

* Apply suggestions from code review

* fix

* max

* check

* add and update tests

* max

* check

* check

* check

* chlog

* tests

* update exception message

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>
Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
2020-08-07 13:02:36 +02:00
Nima Sarang 793036d29c
Support returning python scalars in DP (#1935)
* Override the default gather method to support scalars

* add computing average of a list

* bug: change if to elif

* add some tests

* change style

* change documentation

* use apply_to_collection in DP gather

* use apply_to_collection in DP gather

* fix warning msg

* override gather method in DP

* add tests for python scalars

* add python scalars to docstring

* Update message

* override gather method in DP

* formatting

* chlog

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
2020-08-07 09:18:29 +02:00
Nicki Skafte 9a402461da
Bugfix: Lr finder and hparams compatibility (#2821)
* fix hparams lr finder bug

* add tests for new functions

* better tests

* fix codefactor

* fix styling

* fix tests

* fix codefactor

* Apply suggestions from code review

* modified hook

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2020-08-07 00:34:48 +02:00
Jirka Borovec ed3ee982b3
clean tests imports (#2834)
2020-08-06 16:58:51 +02:00
s-rog 9b997c8616
add test for none checkpoint in ddp_spawn (#2845)
* add test for none checkpoint in ddp_spawn

* fix code style

* make sure checkpoint_callback is none

* Fix tests

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
2020-08-06 07:11:43 -04:00
xmotli02 767c44950c
Added basic file logger (#2721)
* Added basic file logger #1803

* fixup! Added basic file logger #1803

* fixup! Added basic file logger #1803

* fixup! Added basic file logger #1803

* fixup! Added basic file logger #1803

* fixup! Added basic file logger #1803

* csv

* Apply suggestions from code review

* tests

* tests

* tests

* miss

* docs

Co-authored-by: xmotli02 <xmotli02@users.noreply.github.com>
Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-08-06 06:08:25 -04:00
Younghun Roh ac4a215071
Faster Accuracy metric (#2775)
* Faster classification stats

* Faster accuracy metric

* minor change on cls metric

* Add out-of-bound class clamping

* Add more tests and minor fixes

* Resolve code style warning

* Update for #2781

* hotfix

* Update pytorch_lightning/metrics/functional/classification.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update about conversation

* Add docstring on stat_scores_multiple_classes

Co-authored-by: Younghun Roh <yhunroh@mindslab.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-08-06 11:40:35 +02:00
Justus Schock fe29c53ab5
add ddp sync for logging in result step (#2822)
* add ddp sync for logging in result step

* pep8

* pep8

* make ddp tests run also on cpu (except windows)

* create class instance in ddp test

* revert automated formatting

* pep8
2020-08-05 20:42:09 -04:00
William Falcon b507c42c47
clarify batch hooks (#2842)
* modified hook

* modified hook

* modified hook

* modified hook

* modified hook

* modified hook

* modified hook

* modified hook

* modified hook

* modified hook

* modified hook

* modified hook

* modified hook
2020-08-05 20:01:30 -04:00
Ananya Harsh Jha a5f2b89ed0
updated sync bn (#2838)
* updated sync bn

* updated sync bn

* updated sync bn

* updated sync bn

* updated sync bn

* updated sync bn

* updated sync bn

* updated sync bn

* added ddp_spawn test

* updated test

* clean

* clean

Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
2020-08-06 01:12:11 +02:00
William Falcon 5d0f0325d8
Revert "Support limit_mode_batches (int) for infinite dataloader" (#2839)
* Revert "Support limit_mode_batches (int) for infinite dataloader (#2787)"

This reverts commit de9c9f0864.

* Update training_tricks.py
2020-08-05 15:57:26 -04:00
Jeff Yang 5bbcb8db1f
Improve SSIM (#2833)
* make ssim fast

* remove padding

* pep8

* add comments for readability

* plus -> coef
2020-08-05 13:40:11 -04:00
Rohit Gupta de9c9f0864
Support limit_mode_batches (int) for infinite dataloader (#2787)
* Support limit_mode_batches(int) for infinite dataloader

* flake8

* revert and update

* add and update tests

* pep8

* chlog

* Update CHANGELOG.md

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Add suggestions by @awaelchli

* docs

* Apply suggestions from code review

Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>

* Apply suggestions from code review

* fix

* max

* check

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>
Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
2020-08-05 17:04:49 +00:00
Nicki Skafte e3732789d7
Add remaining sklearn metrics (#2562)
* added balanced accuracy

* added dcg score

* added mean absolute error

* added mean squared error

* fix

* added mean squared log error

* add median absolute error and r2 score

* switch arguments

* added mean poisson deviance

* add mean gamma deviance and mean tweedie deviance

* fix styling

* added explained variance score

* added cohen kappa score

* added hamming, hinge, jaccard

* fix styling

* update sklearn requirement to newer version

* update requirement

* fix doctest

* fix tests

* added balanced accuracy

* added dcg score

* added mean absolute error

* added mean squared error

* fix

* added mean squared log error

* add median absolute error and r2 score

* switch arguments

* added mean poisson deviance

* add mean gamma deviance and mean tweedie deviance

* fix styling

* added explained variance score

* added cohen kappa score

* added hamming, hinge, jaccard

* fix styling

* update sklearn requirement to newer version

* fix doctest

* fix tests

* fix doctest

* fix failing docs

* fix test

* trying to fix errors

* Apply suggestions from code review

* format

Co-authored-by: Nicki Skafte <nugginea@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
2020-08-05 11:32:53 +02:00
Justus Schock ad0f1194aa
Support Mean in DDP Sync (#2568)
* Update converters.py

* Update test_converters.py

* pep8

* pep8 tests

* Update test_datamodules.py

* Update test_converters.py

* Update converters.py

* Update test_datamodules.py

* Update test_converters.py

* Update test_converters.py

* fix tests

* fix ddp tests on windows

* chlog

Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
2020-08-04 18:32:20 +02:00
Jirka Borovec 448be60701
update GPU to PT 1.5 (#2779)
* update gpu PT 1.6

* fix docker

* use PT 1.5

* Update tests/install_AMP.sh

Co-authored-by: Nathan Raw <nxr9266@g.rit.edu>

Co-authored-by: Nathan Raw <nxr9266@g.rit.edu>
2020-08-02 08:14:53 -04:00
Rohit Gupta 8baec1a191
Fix shuffle for distributed sampler (#2789)
* Fix shuffle for distributed sampler

* add test

* test

* chlog

* update test

* update test

* update test

* assertions via callback

* define callback outside for pickling

* skip ddp test on windows

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-08-01 23:22:57 -04:00
Nathan Raw 036bcea499
Call DataModule hooks implicitly in trainer (#2755)
*  call dm hooks in trainer implicitly

*  update tests

* 📝 remove unused stage arg from dm docs

*  update tests

*  update tests

* 🚧 include stage in datamodule.setup

* 📝 docs

* 📝 docs

* added more dm tests

* added more dm tests

* 🐛 call dm.setup everywhere

* 🔥 pickle tests now implied by accelerator tests

* 🎨 set dm as attr of trainer

* 🐛 .

* 🚧 wip

* add can prepare test

* add can prepare test

* verified setup in fit

* fixed setup call

* fixed setup call

* fixed setup call

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-08-01 20:17:57 -04:00
Jirka Borovec 3772601cd6
update CI testing with pip upgrade (#2380)
* try pt1.5

* cpu

* upgrade

* tpu

* user

* [blocked by #2380] freeze GPU PT 1.4 (#2780)

* freeze

* user
2020-07-31 14:50:06 -04:00
Jirka Borovec bc7a08fbe0
test dockers & add AMP in pt-1.6 (#1584)
* exist images

* names

* images

* args

* pt 1.6 dev

* circleci

* update

* refactor

* build

* fix

* MKL
2020-07-31 08:23:13 -04:00
Thomas Schaaf a6719f09f0
Bugfix/torchtext include lengths (#2689)
* Test using torchtext.data.Field with include_lengths=True/False

* Fix issue that Tensors in a Batch generated by torchtext with torchtext.data.Field configured as include_lengths=True

* Add description for fix of issue #2688

* changes to accommodate CodeFactor issues

* Another attempt to make last CodeFactor issue pass (it's a false alarm)

* temporarily disable test of test_grad_tracking to check if testing will pass

* reenable test in test_grad_norm

* Update CHANGELOG.md

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Renamed get_torchtext_data_iterator to _get_torchtext_data_iterator as suggested by @borda

* Update pytorch_lightning/utilities/apply_func.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* adding tests more specific to batch_move_data_to_device with torchtext Batch

* added check that Tensors were moved to target device

* removed tests using RNN models to be moved into a separate PR

* fixing FLAKE8 errors that showed up after merge from master branch
	modified:   tests/base/datamodules.py
	modified:   tests/callbacks/test_model_checkpoint.py

* parameterized test to reduce code duplication

* Added check only if length tensor exist. Removed left over comments.

* rearranged device parameterization and added pytest.param

* Try to figure out why only one device is tested on Linux machines

* Testing on CPU and GPU devices (GPU test is skipped if no cuda device is available).

* added test for TPU device (experimental)

* Adding test parameterization for TPU test (experimental)

* change import statement to limit what is imported for a TPU environment

* made test work with TPU

* Change to trigger CI

* Change to trigger CI

* uncommented TPU test to check CI

* reenabling TPU test

* small change to trigger CI build

* small change to trigger CI build

* small change to trigger CI build

* adding tests/utilities/test_apply_func_torchtext.py to CI TPU test

* try to make test not skipped on CI with TPU

* remove testing on TPU

* undo an accidental change to test_tpu.py (file should not have been touched)

* small change to trigger CI build

* small change to trigger CI build

* Update tests/utilities/test_apply_func_torchtext.py

* Revert to previous version

* Apply suggestions from code review

* Change to trigger CI

Co-authored-by: Thomas Schaaf <tschaaf@mmm.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Thomas Schaaf <tschaaf@cs.cmu.edu>
2020-07-31 07:53:08 -04:00
Lezwon Castelino b7afac351b
Add onnx export (#2596)
* export model to onnx

* prepare data before exporting

* support for dataloaders and tensors

* added tests

* use example_input_array
add to changelog

* updated docstring

* added onnx inference tests

* temp commit

* removed schema valid test

* add onnxruntime to environment.yml

* moved onnxruntime to environment.yml pip

* add example in doc

* add lines between code block

* added PR to changelog

* is file check

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* remove *

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* infer example outputs

* added doctest for onnx

* fix windows tests

* moved eval within condition block

* self.forward to self

* added docs

* fixed docs error

* added to toctree

* Update CHANGELOG.md

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-07-31 12:27:57 +02:00
Jirka Borovec 06e8910f06
pytorch 1.6 (#2745)
* pt 1.6

* don't use the new zipfile serialization for now

* quick flake8 fixes

* remove unnecessary f

* coalesce strings

* remove comma

* remove extra commas

* Apply suggestions from code review

Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com>

* set _use_new_zipfile_serialization to False only for pytorch 1.6.0

* remove unnecessary comments

* flake8 fixes

* use pkg_resources instead of packaging

* readme

* format

* version

* chlog

Co-authored-by: Peter Yu <peter@asapp.com>
Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com>
2020-07-31 11:18:32 +02:00
Jirka Borovec 949734489a
remove deprecated in v0.9 (#2760)
* remove deprecated in v0.9

* data_loader

* import

* hook

* args
2020-07-30 23:19:28 +02:00
Phil 2f0fb34496
Speed up gradient clipping and allow parameters on multiple devices. (#2767)
The speed up is achieved by:
- Moving the "where" out of the loop (and replacing with min for simplicity).
- Replacing manual sum and pow with torch.norm. Even though this results
  in unnecessary computation (computing pow(root)) this is still a lot
  faster.
- Preallocating the output gives a slight speed up.

Note that calling .to for all parameters results in a small speed
penalty (~4 ms in my case) but allows parameters on different devices.

Overall this reduces the time used for gradient clipping from 206ms to
74 ms for my model (Resnet50 + few additional vars, all vars on GPU).
2020-07-30 11:53:24 -04:00
Ethan Harris 458d3e210e
Add missing methods to logger collection (#2723)
* Add missing methods to logger collection

* Update CHANGELOG.md

* Fix errors after merge

* Fix codefactor issues

* Update CHANGELOG.md

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-07-29 23:53:02 +02:00
Jirka Borovec 590e7fb1fd
tests: add default_root_dir=tmpdir (#2392)
* tests: add default_root_dir=tmpdir

* remove duplicate tmpdir args

* add missing fixture

* test requires multi gpu

* typo

* resize

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-07-28 09:47:53 -04:00
Jirka Borovec 0fe933e23d
fixing TPU tests (#2632)
* init

* rename

* tpu_core_idx

* idx 8

* idxs

* @pl_multi_process_test

* assert

* assert

* daemon

* no close

* import

* msg

* use_single_gpu

* dataset

* idx

* fix idx

* dataset

* format

* add picklable

* typo

* apex

* typo

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* docs

* typo

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* docs

* docs

* Apply suggestions from code review

Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* Apply suggestions from code review

Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>

* docs

* Apply suggestions from code review

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-07-27 19:07:09 -04:00
Rohit Gupta 84c507c4df
Fix max_batches with fast_dev_run. (#2581)
* Fix fast_dev_run to run for all val_dataloaders

* fast_dev_run check

* changelog

* explicit

* limit_batches with fast_dev_run in init

* add test

* whitespace and comment fix

* comment and assertion

* added tests

* Fix fast_dev_run to run for all val_dataloaders

* fast_dev_run check

* changelog

* explicit

* limit_batches with fast_dev_run in init

* add test

* whitespace and comment fix

* comment and assertion

* added tests

* added tests

* added tests

* added tests

* update rtol

* Revert "update rtol"

This reverts commit 4320329540.

* added tests

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-07-27 17:56:55 -04:00
Adrian Wälchli d03953260d
Fix weights_save_path when logger is used + simplify path handling + better docs (#2681)
* fix weights_save path and drop ckpt_path

* add tests

* unused import

* update docs

* changelog

* pep8

* fix horovod test

* make backward compatible

* perform same test for all loggers

* fix for when logger=False and weights_save_path is set

* update changelog

* update docs

* update tests

* do not set save dir dynamically

* remove duplicate test

* remove duplicated tests

* update tests

* update tests

* remove remaining ckpt_path references

* move defaults to init as suggested by @Borda

* test deprecation
2020-07-27 12:53:11 -04:00
William Falcon 4dbd761a1c
refactor 3/n (#2709)
* refactor into gpu accelerator

* refactor into gpu accelerator

* refactor into gpu accelerator

* refactor into gpu accelerator

* refactor into gpu accelerator

* refactor into gpu accelerator

* refactor into gpu accelerator

* refactor into gpu accelerator

* refactor into gpu accelerator

* refactor into gpu accelerator

* refactor into gpu accelerator

* refactor into gpu accelerator

* refactor into gpu accelerator

* refactor into gpu accelerator

* refactor into gpu accelerator

* refactor into gpu accelerator

* refactor into gpu accelerator

* refactor into gpu accelerator

* refactor into gpu accelerator

* refactor into gpu accelerator

* refactor into gpu accelerator
2020-07-25 20:56:50 -04:00
Nathan Raw 9076551aec
Enable val/test loop disabling + datamodule tests (#2692)
* 🎨 warn instead of error out on loaders

* 🐛 test misconfiguration should still fail

* 🚧 .

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

* updated docs with new result obj

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-07-25 12:57:40 -04:00
Rohit Gupta cb0c6ad51a
fix setup call while testing (#2624)
* fix setup call while testing

* changelog

* drop if condition

* add test to check setup call

* flake8

* update test to check model stage

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-07-24 23:57:31 -04:00
Nathan Raw 1caf8beb2c
Datamodule (#2668)
*  Add copy of pl_bolts datamodule to lightning

*  add datamodule to necessary init files

* 🚧 add datamodule property to LightningModule

* 🚧 .

* 🎨 Let DataModule do its own thing

* 🚧 add back setup and run both hooks implicitly

* 🚧 .

* 🐛 fix add_argparse_args

* 💄 apply black formatting and isort

* 📝 docstrings

* 📝 .

* 📝 .

* 🐛 overwrite cls prepare_data instead of instance

* 📝 .

*  add some tests

* Update datamodule.py

* Update datamodule.py

* Update datamodule.py

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-07-24 11:42:15 -04:00
Adrian Wälchli 938ec5a6c1
remove duplicate tests (#2685)
* remove duplicate test

* remove duplicated tests
2020-07-24 08:15:40 -04:00
Travis Addair 1369012bc7
Horovod: adjust base LR used by schedulers to scale with the number of workers (#2626)
* Horovod: Adjust base LR used by schedulers to match that of the optimizer after scaling by number of workers

* Added unit test

* Removed debug statements

* Updated changelog

* Apply suggestions from code review

Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-07-23 12:14:57 -04:00
Jeff Yang bda7cf1653
metrics: add SSIM (#2671)
* metrics: add SSIM

* Update CHANGELOG.md

fix codefactor issue

fix doctest

fix doctest

fix test

* added test for raise Error
2020-07-23 12:13:52 -04:00
Adrian Wälchli 1e68968ed7
support num_sanity_val_steps=-1 (#2246)
* support sanity_val_step=-1

* fix list size

* simplification

* simplify

* add test for num_sanity_val_steps=-1

* update test

* update docs

* extend tests to multiple dataloaders

* changelog

* Update tests/trainer/test_trainer.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* improve test

* refactor the sanity check decision

* fix merge

* Update trainer.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-07-23 07:07:03 -04:00
William Falcon 62ce00f96c
EvalResult support for val loop (PR 3/5) (#2651)
* add EvalResult to support to val/test loops
2020-07-22 13:53:10 -04:00
Jeff Yang 0a65826462
metrics: add BLEU (#2535)
* metrics: added bleu score and test bleu

* metrics: fixed type hints in bleu

* bleu score moved to metrics/functional/nlp.py

* refactor with torch.Tensor

* Update test_sequence.py

* refactor as Borda requests and nltk==3.2

* locked nltk==3.3

* nltk>=3.3, parametrized smooth argument for test

* fix bleu_score example

* added class BLEUScore metrics and test

* added class BLEUScore metrics and test

* update CHANGELOG

* refactor with torchtext

* torchtext changed to optional import

* fix E501 line too long

* add else: in optional import

* remove pragma: no-cover

* constants changed to CAPITALS

* remove class in tests

* List -> Sequence, conda -> pip, cast with tensor

* add torchtext in test.txt

* remove torchtext from test.txt

* bump torchtext to 0.5.0

* bump torchtext to 0.5.0

* Apply suggestions from code review

* ignore bleu score in doctest, renamed to nlp.py

* back to implementation with torch

* remove --ignore in CI test, proper reference format

* apply justus comment

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-07-22 09:58:24 -04:00
Adrian Wälchli a5538af355
fix dtype/device property not getting updated in submodules (#2657)
* recursive dtype device apply

* simplify

* simple test

* submodule test

* rename

* explicit

* type hints

* test for dp backend

* fix test skip

* rename

* add ddp_spawn test

* fix None index in test

* try fix ddp_spawn test

* changelog

* move _dtype and _device to mixin

* additional doctest
2020-07-21 15:18:57 -04:00
William Falcon 6d10ac2ac8
Structured results (train loop only. val loop separate PR) (PR 2/5) (#2615)
* r

* r

* r

* patched optimizer closure with sr

* patched optimizer closure with sr

* patched optimizer closure with sr

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added train step structured result

* added autoreduce for train step

* added auto reduce on train

* added auto reduce on train

* added auto reduce on train

* added auto reduce on train

* added auto reduce on train

* added auto reduce on train

* added hooks

* added hooks

* added hooks

* added hooks

* added hooks

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* cache

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* Update pytorch_lightning/callbacks/early_stopping.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/callbacks/early_stopping.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/callbacks/early_stopping.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/callbacks/model_checkpoint.py

* Update pytorch_lightning/core/step_result.py

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* simple

* finished tests for structured results on train epoch

* simple

* simple

* revert

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* Update tests/base/deterministic_model.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* finished tests for structured results on train epoch

* docstring typos

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* finished tests for structured results on train epoch

* Update pytorch_lightning/core/step_result.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Update pytorch_lightning/overrides/data_parallel.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

Co-authored-by: Jirka <jirka@pytorchlightning.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
2020-07-20 19:00:20 -04:00
William Falcon aaa1553e35
tests for val loop flow (#2605)
* add tests for single scalar return from training

* add tests for single scalar return from training

* add tests for single scalar return from training

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only

* fixing val step only
2020-07-14 14:20:45 -04:00
William Falcon 1d565e175d
add tests for single scalar return from training (#2587)
* add tests for single scalar return from training

* add tests for single scalar return from training

* add tests for single scalar return from training

* add tests for single scalar return from training

* add tests for single scalar return from training
2020-07-11 17:43:00 -04:00
Jirka Borovec 458bbad550
Avoid zeros in dice and iou (#2567)
* nones

* fix

* fix

* test

* test

* test

* fix

* eps

* tpu

* eps

* type

* test tpu

* Update __init__.py

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-07-09 20:40:10 -04:00
William Falcon f35337adba
Fixes .test() for ddp (#2570)
* enable none checkpoint
2020-07-09 18:36:36 -04:00
William Falcon b73812648f
don't pass tpu weights back on test (#2566)
* enable none checkpoint
2020-07-09 12:11:56 -04:00
Rohit Gupta 6f4a488bae
Add functional regression metrics (#2492)
* Add functional regression metrics

* add functional tests

* add docs

* changelog

* init

* pep8

* docs

* docs

* setup docs

* docs

* Apply suggestions from code review

* Apply suggestions from code review

* typo

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka <jirka@pytorchlightning.ai>
2020-07-09 17:54:38 +02:00
William Falcon 4bbcfa04a3
.fit() returns last not best weights in ddp_spawn (#2565)
* added base tests for tpu

* enable none checkpoint
2020-07-09 11:36:21 -04:00
Adrian Wälchli f16b4cfc52
save_dir fix for MLflowLogger + save_dir tests for others (#2502)
* mlflow rework

* logger save_dir

* folder

* mlflow

* simplify

* fix test

* add a test for file dir contents

* new line

* changelog

* docs

* Update CHANGELOG.md

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* test for comet logger

* improve mlflow checkpoint test

* prevent comet logger error on pytest exit

* test tensorboard save dir structure

* wandb save dir test

* skip test on windows

* add mlflow to pickle tests

* wandb

* code factor

* remove unused imports

* remove unused setter

* wandb mock

* wip mock

* wip mock

* wandb tests with mocking

* clean up

* clean up

* comments

* include wandblogger in test

* clean up

* missing argument

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-07-09 07:15:41 -04:00
Hayden Housen 992a7e2a41
Start accumulate gradients schedule at epoch 0 (continued) (#2513)
* Start accumulate gradients schedule at epoch 0

* Undo change in #2375

* Update test_trainer.py::test_gradient_accumulation_scheduling

* Fix pep8 formatting

* Remove 'Datasets/' folder

* Split args for readability

* Fix pep8 formatting
2020-07-09 07:11:07 -04:00
Espen Haugsdal b3ebfec863
Fix argparse default value bug (#2526)
* Add failing test for bug

* Fix bug
2020-07-09 07:10:30 -04:00
William Falcon a95ef5a4ac
remove parameterize from TPU tests (#2561)
* added base tests for tpu
2020-07-09 06:46:07 -04:00
William Falcon 69cbb62774
Finish #2549 (#2557)
* removed spawns for test_converters and verified tests

Co-authored-by: Ananya Harsh Jha <ahj265@nyu.edu>
Co-authored-by: zcain <zcain@google.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka <jirka@pytorchlightning.ai>
2020-07-08 20:33:48 -04:00
Rohit Gupta d3f5717e81
Fix parameters and docs in metrics (#2473)
* Fix parameters and docs in metrics

* doc improvements

* whitespace

* doc indentation

* Apply suggestions from code review

Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>

* zero

* drop defaults

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
Co-authored-by: Jirka <jirka@pytorchlightning.ai>
2020-07-08 14:11:40 +02:00
Marijan Smetko 1dc724239a
PSNR metric (#2483)
* Add stub PSNR metric

* Fix linter

* Add data range as parameter

* Add tests

* Add scikit-image

* Add PSNR to regression metrics and add functional

* Refactor to functional

* Fix linter

* Fix linter, again

* Fix linter, again

* Fix typo in test

* Fix typo in another test

* Add scikit-image to conda

* Lift numpy requirement

* Add random tests

* Update CHANGELOG.md

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-07-08 10:26:11 +02:00
Anthony Bisulco 899cd74044
flatten Wandb hyperparameters dict (#2459)
* wandb logging fix

* Changelog fix

* change test
2020-07-08 07:45:25 +02:00
Adrian Wälchli 78db847e42
Fixed skipped horovod tests (#2514)
* skip ckpt test on rank > 0

* fix test

* add extra assert

* code factor

* add back removed

* add old loading code

* add back old

* unused import

* add same skip to run_model_without_loggers

* test if horovod now works with python 3.8

* test remove all 3.8 skips

* remove spawn

* fix

* fix test

* move load check up

* fix test multigpu

* rename

* fix gpu mode

* on gpu fix when on cpu

* move
2020-07-07 14:54:07 -04:00
William Falcon 11069c8784
Fix ddp tests + .test() (#2512)
* added base tests for tpu

* fix deprecation warnings

* Update pytorch_lightning/trainer/trainer.py

Co-authored-by: Jeremy Jordan <13970565+jeremyjordan@users.noreply.github.com>

Co-authored-by: Jirka <jirka@pytorchlightning.ai>
Co-authored-by: Jeremy Jordan <13970565+jeremyjordan@users.noreply.github.com>
2020-07-07 12:24:56 -04:00
Jirka Borovec 977df6ed31
Docker: building XLA base image (#2494)
* refactor

* add TPU base

* wip

* builds

* typo

* extras

* simple

* unzip

* rename
2020-07-06 14:21:36 -04:00
Jeremy Jordan a91b06ed1e
fix worker warning (#2504)
* fix worker warning

* improve tests

* suggestion

Co-authored-by: Jirka <jirka@pytorchlightning.ai>
2020-07-06 15:45:43 +02:00
vr140 96b32bee04
[tiny] Fix training_dataloader usage to be train_dataloader instead. (#2521)
Co-authored-by: Vijay Rajaram <vrajaram3@gatech.edu>
2020-07-06 10:44:44 +02:00
Adrian Wälchli 1098a0d725
make loggers pickleable (#2518)
* state updates to logger

* change log

* changelog
2020-07-05 19:57:22 -04:00
Adrian Wälchli 6bfcfa8671
fix dtype conversion of example_input_array in model summary (#2510)
* fix dtype conversion

* changelog
2020-07-05 07:17:22 -04:00
William Falcon 9924c76faa
Amp2 (#2505)
* fix tpu hang
2020-07-04 22:52:49 -04:00
William Falcon 020c332ae9
Clean up (#2467)
* Fixes #2455

* added early stop tpu test
2020-07-03 00:38:29 -04:00
Adrian Wälchli 927f305f7e
Warn user when IterableDataset has __len__ defined (#2437)
* add warning when checking len

* added test

* changelog

* pep

* do not show warning below 1.4

* try version parse

* comments

* xfail

* Update requirements/base.txt

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/trainer/data_loading.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* version

Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka <jirka@pytorchlightning.ai>
2020-07-01 07:53:19 -04:00
Adrian Wälchli 145670f893
fix logging on rank 0 only (#2425)
* fix and test for ddp block logging rank > 0

* rename

* use the dummy logger

* dummy logger test

* set the logger in model

* decorator for rank zero experiment

* simplify check

* simplify

* fix problem with None in checkpoint path

* revert configure logger

* unused import

* offline

* try rank 0 decorator in checkpoint

* try fix test

* imgs

* add asserts to make sure log zero only saves checkpoints

* fix tpu tests

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-06-30 18:09:16 -04:00