* allow loading checkpoints from urls
* tmpdir_server fixture
* test cases for loading checkpoints from url
* dir => root_dir
* default map_location to None
* test case for resume_from_checkpoint
* changelog
* doc update
* monkeypatch TORCH_HOME to avoid caching
* Use a threading server with random ports so that it is easier to clean up
* test fixes
* pep8 fix
* ThreadingHTTPServer support in 3.6
* pep8 fix
* fix changelog
* separate tests for urls
* typo
Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
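The commits above add URL support to checkpoint loading. Below is a minimal usage sketch, assuming a LightningModule subclass named `LitModel` and a hypothetical checkpoint URL; `map_location` defaults to `None`, mirroring `torch.load`.

```python
import torch
from pytorch_lightning import LightningModule


class LitModel(LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(28 * 28, 10)


# Local file paths keep working; an http(s) URL is now accepted as well.
model = LitModel.load_from_checkpoint(
    "https://example.com/checkpoints/epoch=3.ckpt",  # hypothetical URL
    map_location=None,  # forwarded to torch's checkpoint loading
)
```

The test-infrastructure commits (tmpdir_server fixture, threading server on a random port, monkeypatching TORCH_HOME) suggest a fixture roughly like the sketch below. The names are illustrative, not the exact fixture from the test suite, and `ThreadingHTTPServer` needs Python 3.7+ (the PR adds a 3.6 fallback).

```python
import threading
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer

import pytest


@pytest.fixture
def tmpdir_server(tmpdir, monkeypatch):
    # avoid caching downloaded checkpoints between tests
    monkeypatch.setenv("TORCH_HOME", str(tmpdir.mkdir("torch_home")))
    handler = partial(SimpleHTTPRequestHandler, directory=str(tmpdir))
    server = ThreadingHTTPServer(("localhost", 0), handler)  # port 0 -> random free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    yield server.server_address  # (host, port) used to build the checkpoint URL
    server.shutdown()
```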
* training batch clean up
* adding spawn
* do not include local vars in auto collection
* add test
* add test for model with "self" renamed to "obj"
* skip decorator
* changelog
* changelog
* update docs
* remove obsolete child collection
* generalize *args, **kwargs names
* docs
* also update varargs passed in
* Revert "also update varargs passed in"
This reverts commit 3d7a30dbee07a513ee13e1cc3e08ca5ccdb85734.
* update test
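A hedged sketch of what the auto-collection change above means in practice: only `__init__` arguments end up in hparams, never locals created inside the method. The `save_hyperparameters()` entry point below is an assumption about the user-facing call; the exact API may differ in this release.

```python
from pytorch_lightning import LightningModule


class LitModel(LightningModule):
    def __init__(self, learning_rate=1e-3, hidden_dim=64):
        super().__init__()
        scaled = hidden_dim * 2  # a local variable; it must NOT be auto-collected
        self.save_hyperparameters()  # assumed entry point for the auto collection
        self.scaled = scaled


model = LitModel(learning_rate=0.01)
assert model.hparams.learning_rate == 0.01
assert "scaled" not in model.hparams  # locals stay out of hparams
```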
* black
Added through black.toml; other options are hard so far
No caching for black github action
Moved from black.toml to pyproject.toml
Exclude not only yml but also yaml
Update pyproject.toml
Co-authored-by: Thomas Johansen <thomasjo@gmail.com>
Update .github/workflows/code-formatting-check.yml
mergify
Remove formatting check
Ignore E231 error because of Black formatting
Updated CONTRIBUTING to the master
* Update .github/workflows/code-formatting-check.yml
* Bump black to 19.10b0 version
* resolved incorrect merge of CONTRIBUTING
Black skips string normalization
* Minor fixes in CONTRIBUTING, two typos
* Update setup.cfg
* chlog
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka <jirka@pytorchlightning.ai>
* refactor and added hook
variant a
variant b
add test
revert rename
add changelog
docs
* resolve merge duplication
* fix "overridden" typo
* fix test
* tpu id
* raise if TPU not available
* re-use apply_to_collection function for parsing collections
* comment
* make utility function available to user
* documentation
* move changelog entry to top
* fix tpu transfer call
* fix call
* remove hardcoded string
* improve test
* call model hook by default
* Apply suggestions from code review
* rename utility function
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
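A hedged sketch of the hook and utility referenced in the commits above: a custom `transfer_batch_to_device` override can handle user-defined batch types, while the now-public utility (built on `apply_to_collection`) moves ordinary tensors and collections. The import path and hook signature are assumptions for this release and may differ.

```python
import torch
from pytorch_lightning import LightningModule
from pytorch_lightning.utilities import move_data_to_device  # assumed public import path


class CustomBatch:
    def __init__(self, data):
        self.data = data


class LitModel(LightningModule):
    def transfer_batch_to_device(self, batch, device):
        if isinstance(batch, CustomBatch):
            # custom objects need explicit handling
            batch.data = batch.data.to(device)
            return batch
        # tensors, lists, dicts, ... are walked via apply_to_collection
        return move_data_to_device(batch, device)
```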
* Raise an error when lightning replaces an existing sampler
Currently, Trainer replaces the existing sampler with DistributedSampler
when running distributed training and `replace_sampler_ddp=True` (the default
behaviour). If a user has configured an existing sampler, this would lead to
widely different results between distributed and non-distributed training.
This PR fixes that by raising an error if the user has configured a sampler
and uses `replace_sampler_ddp=True`. The recommended behaviour from now on
is to either remove the sampler or set `replace_sampler_ddp=False` (see the
sketch below).
* Fix tests
* Simpler fix
* Fix tests
* Make inner method protected
* Apply suggestions from code review
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
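A usage sketch of the recommendation above: keep a custom sampler only if automatic replacement is disabled. The flag name is the one quoted in the commit message; the rest of the setup is illustrative.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler
from pytorch_lightning import Trainer

dataset = TensorDataset(torch.randn(100, 32), torch.randint(0, 2, (100,)).float())
sampler = WeightedRandomSampler(weights=torch.ones(len(dataset)), num_samples=len(dataset))
train_loader = DataLoader(dataset, batch_size=8, sampler=sampler)

# With a user-configured sampler, either drop it or turn off DDP replacement;
# otherwise the Trainer now raises instead of silently swapping the sampler.
trainer = Trainer(replace_sampler_ddp=False)
```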
* fix grad norm formula
* grad-norm tracker test
* fixed seed and explicit rtol in grad norm tracking test
* a docstring for grad-norms and forced cast to float of norm_type
* support for inf-norm
* renamed the grad norm test
* docs
* fixed language in docstring
* Apply suggestions from code review
Co-authored-by: Jirka <jirka@pytorchlightning.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
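The corrected formula behind the commits above, as a hedged standalone sketch (not the exact Lightning implementation): the total p-norm over all parameter gradients is `(sum_i ||g_i||_p ** p) ** (1/p)`, the inf-norm is the maximum absolute gradient value, and `norm_type` is cast to `float` so `"inf"` works.

```python
import torch


def total_grad_norm(parameters, norm_type):
    """Total gradient norm across parameters, following the corrected formula."""
    norm_type = float(norm_type)
    grads = [p.grad for p in parameters if p.grad is not None]
    if norm_type == float("inf"):
        return max(g.abs().max().item() for g in grads)
    # sum of p-th powers of per-parameter norms, then the 1/p root
    total = sum(g.norm(norm_type).item() ** norm_type for g in grads)
    return total ** (1.0 / norm_type)
```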
* use parallel loader
* Revert "use parallel loader"
This reverts commit ed6e7583
* select tpu id for pl
* condition if tpu_id is None
* added info to changelog
* Revert "condition if tpu_id is None"
This reverts commit 1fb6e586
* Apply suggestions from code review
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* fix(wandb): use same logger on multiple training loops
New training loops reset the step to 0, which would previously try to overwrite earlier logs.
Fixes #2015
* docs(changelog.md): add reference to PR 2055
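A hedged sketch of the scenario this fixes: reusing one `WandbLogger` across several training loops continues from the last logged step instead of resetting to 0 and overwriting earlier logs. The tiny model and project name below are illustrative.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from pytorch_lightning import LightningModule, Trainer
from pytorch_lightning.loggers import WandbLogger


class TinyModel(LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return {"loss": torch.nn.functional.mse_loss(self.layer(x), y)}

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)

    def train_dataloader(self):
        data = TensorDataset(torch.randn(64, 32), torch.randn(64, 1))
        return DataLoader(data, batch_size=8)


wandb_logger = WandbLogger(project="my-project")  # hypothetical project name

# Both loops share the same logger; the second no longer tries to rewrite step 0.
Trainer(logger=wandb_logger, max_epochs=1).fit(TinyModel())
Trainer(logger=wandb_logger, max_epochs=1).fit(TinyModel())
```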
* replace ddp spawn with subprocess
* hot fix
* Patch for issue 1815, which will allow EarlyStopping to work on precision=16
* Added a whitespace to the end of the line so CI/CD can rerun. No reason for the latest macOS test to have been cancelled.
* Format.
* Add an additional attribute to ModelCheckpoint to keep track of the best model's path
Currently, only the best metric value is directly tracked. This new attribute will help in use cases where the trained model needs to be used or tracked right after training.
* Add small description and usage example to docs
* Fix PEP8 issues
* Fix doctest example
* Fix expected output in doctest
* Apply suggestions from code review
* Show example as code block instead of doctest
* Apply suggestions from code review
* Update CHANGELOG.md
* Rename `ModelCheckpoint.best` to `ModelCheckpoint.best_model_score`
Also rename `ModelCheckpoint.best_model` (added in this PR) to `ModelCheckpoint.best_model_path`, for consistency, and `kth_best_model` to `kth_best_model_path`.
* Update pytorch_lightning/trainer/training_io.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* Add warning when loading checkpoint from an old version
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
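A hedged usage sketch of the renamed attributes: after fitting, both the best monitored score and the path of the corresponding checkpoint can be read off the callback (construction details are illustrative).

```python
from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import ModelCheckpoint

checkpoint_callback = ModelCheckpoint(monitor="val_loss")
trainer = Trainer(checkpoint_callback=checkpoint_callback)

# ... after trainer.fit(model) finishes:
print(checkpoint_callback.best_model_score)  # best monitored value (previously `.best`)
print(checkpoint_callback.best_model_path)   # filesystem path of that checkpoint
```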
* 🐛 fixed fake example type assigning and hparams arg
* fixed GAN example to work with dp, ddp, ddp_cpu
* Update generative_adversarial_net.py
Co-authored-by: William Falcon <waf2107@columbia.edu>
* fix chlog
* test for #1729
* hist
* update
* Document use case of passing test dataloaders to Trainer.test() (#1992)
* Issue 1990 Doc patch.
* Codeblock directive.
* Update to reflect current state of pytorch-lightning
* Final grammar cleaning. I hope these commits are squashed.
* Apply suggestions from code review
* Apply suggestions from code review
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: authman <uapatira@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
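A hedged sketch of the documented use case from the last group of commits: pass dataloaders directly to `Trainer.test()` instead of defining `test_dataloader` on the model. The argument name follows the docs change described here; the tiny model is illustrative.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from pytorch_lightning import LightningModule, Trainer


class TinyModel(LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 1)

    def test_step(self, batch, batch_idx):
        x, y = batch
        return {"test_loss": torch.nn.functional.mse_loss(self.layer(x), y)}

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)


test_data = TensorDataset(torch.randn(16, 32), torch.randn(16, 1))
test_loader = DataLoader(test_data, batch_size=4)

# No `test_dataloader` on the model: hand the loader to `.test()` directly.
Trainer().test(TinyModel(), test_dataloaders=test_loader)
```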