* fix result for dp
* added warning when changing monitor and using results obj
* add ddp script variations
* add ddp test
* rename
* shell
* test
* try call
* try without subprocess
* test
* display the error
* list all variations
* try string
* try copy env
* debug
* pythonpath
* path
* update test
* change
* simple ddp test
* replace
* remove random port
* random port
* str
* clean up
* check run spawn
* clean up
* docs
* update test
* docs
* changelog
* add val step arg to metrics
* add step metrics
* override dist backend when using tpus
* added test
* updated doc string
* drop redundant info...
* drop more redundant info
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
This function contains the statement `if (train_dataloader or val_dataloaders) and datamodule:`.
The issue is the same as in https://github.com/PyTorchLightning/pytorch-lightning/pull/1560: `if dl` evaluates as `if bool(dl)`, and since `DataLoader` defines no `__bool__`, `bool()` falls back to `len(dl) > 0`. For a `DataLoader` wrapping an `IterableDataset`, `__len__` delegates to the dataset's `__len__`, which `IterableDataset` leaves undefined, so the check raises a `TypeError`.
The fix is also the same: replace `if dl` with `if dl is not None`.
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
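A minimal sketch of the pitfall, using a hypothetical `Stream` dataset (illustrative, not code from this PR):

```python
# bool(dl) falls back to len(dl) because DataLoader defines no __bool__,
# and len(dl) in turn needs the dataset's __len__, which IterableDataset
# leaves undefined.
from torch.utils.data import DataLoader, IterableDataset

class Stream(IterableDataset):
    def __iter__(self):
        return iter(range(10))

dl = DataLoader(Stream())

try:
    if dl:  # raises TypeError for iterable-style datasets
        pass
except TypeError as err:
    print(err)

# The safe check works for both map-style and iterable-style datasets:
if dl is not None:
    pass
```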
* fix missing return statement; do not normalize remote paths
* Update pytorch_lightning/utilities/cloud_io.py
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* Add some documentation that we now support s3 and hdfs paths
* suggestion from code review
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
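The gist of the change, as a hedged sketch with hypothetical helper names (not the exact `cloud_io.py` code): `os.path.normpath` collapses the `//` in a URL scheme, so normalization must be skipped for remote paths.

```python
import os

def _is_remote(path: str) -> bool:
    # Hypothetical helper: treat URL-style paths as remote.
    return path.startswith(("s3://", "hdfs://"))

def normalize(path: str) -> str:
    # os.path.normpath("s3://bucket/ckpt") -> "s3:/bucket/ckpt",
    # which breaks the URL, so return remote paths unchanged.
    if _is_remote(path):
        return path
    return os.path.normpath(path)
```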
* Update lr_logger.py
when logging the learning rate, we should offer different logging intervals, including 'step' and 'epoch'
* Update lr_logger.py
add some type annotations and docstrings
* Update lr_logger.py
fixed a bug where `on_train_batch_start()` couldn't be triggered; use `on_batch_start()` instead. Added an `interval` argument so that learning rates can be recorded against either `global_step` or `current_epoch`.
* Update lr_logger.py
restore _extract_lr()
* suggestion
* Update lr_logger.py
modified `_extract_lr()`; it no longer needs the `interval` parameter.
* Update test_lr_logger.py
SkafteNicki's suggestion
* log_interval now supports `None`, `step`, `epoch`
* change `log_interval` to `logging_interval`
* Update test_lr_logger.py
* Update lr_logger.py
* put type checks into `on_train_start()`
* cleanup
* docstring typos
* minor changes from suggestions
Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
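A simplified sketch of how a `logging_interval` of `None`, `'step'`, or `'epoch'` can gate when learning rates are recorded (illustrative; the real `lr_logger.py` differs in detail):

```python
from typing import Optional

class LearningRateLogger:
    def __init__(self, logging_interval: Optional[str] = None):
        # None logs at both granularities; 'step'/'epoch' restrict it.
        if logging_interval not in (None, "step", "epoch"):
            raise ValueError("logging_interval must be None, 'step' or 'epoch'")
        self.logging_interval = logging_interval

    def on_batch_start(self, trainer, pl_module):
        if self.logging_interval in (None, "step"):
            self._log_lr(trainer, step=trainer.global_step)

    def on_epoch_start(self, trainer, pl_module):
        if self.logging_interval in (None, "epoch"):
            self._log_lr(trainer, step=trainer.current_epoch)

    def _log_lr(self, trainer, step: int):
        # Record the current lr of every param group of every optimizer.
        lrs = {}
        for i, opt in enumerate(trainer.optimizers):
            for j, pg in enumerate(opt.param_groups):
                lrs[f"lr-optimizer-{i}/pg{j}"] = pg["lr"]
        trainer.logger.log_metrics(lrs, step=step)
```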
* Add initial tracking of states in Trainer.
* Add INTERRUPTED state, improve tests, move state switching from the callback to the Trainer.
* Move part of the trainer state switching to a decorator.
* Add documentation.
* Fix docs, rename state enum, restore state to previous on exit if None, add tests for decorator only.
* Fix callback typing.
Co-authored-by: William Falcon <waf2107@columbia.edu>
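A minimal sketch of the state-switching decorator described above (illustrative names; it restores the previous state on exit when no exit state is given, matching the behavior noted above):

```python
from enum import Enum
from functools import wraps

class TrainerState(Enum):
    INITIALIZING = "INITIALIZING"
    RUNNING = "RUNNING"
    FINISHED = "FINISHED"
    INTERRUPTED = "INTERRUPTED"

def trainer_state(*, entering=None, exiting=None):
    """Set self.state around the wrapped Trainer method."""
    def decorator(fn):
        @wraps(fn)
        def wrapped(self, *args, **kwargs):
            previous = getattr(self, "state", TrainerState.INITIALIZING)
            if entering is not None:
                self.state = entering
            try:
                return fn(self, *args, **kwargs)
            finally:
                # A KeyboardInterrupt handler may have set INTERRUPTED;
                # never overwrite that on the way out.
                if self.state != TrainerState.INTERRUPTED:
                    self.state = exiting if exiting is not None else previous
        return wrapped
    return decorator
```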
commit 29fb0506cd38a15c359e369cc8bc4435916b0c78
Author: Brendan Fahy <bmfahy@gmail.com>
Date: Sat Aug 8 19:35:30 2020 +0000
fix checking for version for docs to build
commit 467fd640db02275972c7111af031c86bb59333e9
Author: Brendan Fahy <bmfahy@gmail.com>
Date: Sat Aug 8 18:56:05 2020 +0000
remove no local test
commit a7cc9f88de00feec1a5406874d05313c42bd004c
Author: Brendan Fahy <bmfahy@gmail.com>
Date: Sat Aug 8 18:46:44 2020 +0000
fix
commit 3fdbb729da79ae9348c83410a138666bad467951
Author: Brendan Fahy <bmfahy@gmail.com>
Date: Sat Aug 8 18:23:30 2020 +0000
revert requirements
commit 9b8686bd83e2bc243cf329e26f1c667c6949cf67
Author: Brendan Fahy <bmfahy@gmail.com>
Date: Sat Aug 8 18:16:42 2020 +0000
make it a fixture
commit eec74953d24c8b25268d3b6dde3cc4affdd5cb8f
Author: Brendan Fahy <bmfahy@gmail.com>
Date: Sat Aug 8 18:01:32 2020 +0000
fix up the testing
commit 896d94a0e60083d52c81db2a036b7f1e015cad11
Author: Brendan Fahy <bmfahy@gmail.com>
Date: Sat Aug 8 17:47:28 2020 +0000
fix some tests
commit 6d22bde19767bf2b71dfd44839b01efdf6888f83
Merge: 6175d4e2 6ebe0d72
Author: Brendan Fahy <bmfahy@gmail.com>
Date: Sat Aug 8 10:20:47 2020 +0000
Merge remote-tracking branch 'origin/master' into tb_use_gfile
commit 6175d4e26b15a43c412c26d501762cd0b570616a
Author: Brendan Fahy <bmfahy@gmail.com>
Date: Fri Aug 7 10:16:36 2020 +0000
Use tensorboard.compat.gfile to support remote writing
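For context on that last commit: TensorBoard ships a `gfile` shim under `tensorboard.compat.tensorflow_stub.io` that dispatches on the path prefix, so the same write code can target local disk or remote stores such as `s3://`. A hedged sketch (remote backends need their own dependencies, e.g. boto3 for S3):

```python
import os
from tensorboard.compat.tensorflow_stub.io import gfile

def save_bytes(path: str, data: bytes) -> None:
    # gfile picks the filesystem from the path prefix (local, s3://, ...).
    dirname = os.path.dirname(path)
    if dirname and not gfile.exists(dirname):
        gfile.makedirs(dirname)
    with gfile.GFile(path, "wb") as f:
        f.write(data)
```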